diff mbox series

[RFC,v2,2/2] ELF: Add ELF program property parsing support

Message ID 1566581020-9953-3-git-send-email-Dave.Martin@arm.com
State Superseded
Headers show
Series ELF: Alternate program property parser | expand

Commit Message

Dave Martin Aug. 23, 2019, 5:23 p.m. UTC
ELF program properties will needed for detecting whether to enable
optional architecture or ABI features for a new ELF process.

For now, there are no generic properties that we care about, so do
nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY
phdrs entry (if any), and notify each property to the arch code.

For now, the added code is not used.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>


---

Changes since RFC v1:

 * Fix stupid typo in IS_ENABLED().

 * Fix premature dereference of possibly-NULL phdr in
   parse_elf_properties().

 * Demote BUG_ON() to WARN_ON(), and add comments to explain why they
   should never fire.

   These are only for development and should probably go away before
   merge.

 * Enforce that there are no duplicate properties and that the
   properties are sorted on pr_type.

   This is not strictly necessary for kernel safety or for correct
   handling of valid ELF files, but may help flag up binaries from duff
   linkers which we would otherwise silently execute.

 * Shuffle parse_elf_properties() logic to move as many checks as
   possible into parse_elf_property().  This keeps most checks in the
   same function as the code that relies on them, and simplifies the
   outer parsing loop.

   parse_elf_properties's cursor is now just a byte offset into the
   note, which makes comparisons more straightfward and reduces the
   amount of pointer casting a bit.
---
 fs/binfmt_elf.c          | 124 +++++++++++++++++++++++++++++++++++++++++++++++
 fs/compat_binfmt_elf.c   |   4 ++
 include/linux/elf.h      |  19 ++++++++
 include/uapi/linux/elf.h |   4 ++
 4 files changed, 151 insertions(+)

-- 
2.1.4

Comments

Kees Cook Aug. 30, 2019, 5:37 a.m. UTC | #1
On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:
> ELF program properties will needed for detecting whether to enable

> optional architecture or ABI features for a new ELF process.

> 

> For now, there are no generic properties that we care about, so do

> nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> 

> Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> phdrs entry (if any), and notify each property to the arch code.

> 

> For now, the added code is not used.

> 

> Signed-off-by: Dave Martin <Dave.Martin@arm.com>


Reviewed-by: Kees Cook <keescook@chromium.org>


Note below...

> [...]

> +static int parse_elf_property(const char *data, size_t *off, size_t datasz,

> +			      struct arch_elf_state *arch,

> +			      bool have_prev_type, u32 *prev_type)

> +{

> +	size_t size, step;

> +	const struct gnu_property *pr;

> +	int ret;

> +

> +	if (*off == datasz)

> +		return -ENOENT;

> +

> +	if (WARN_ON(*off > datasz || *off % elf_gnu_property_align))

> +		return -EIO;

> +

> +	size = datasz - *off;

> +	if (size < sizeof(*pr))

> +		return -EIO;

> +

> +	pr = (const struct gnu_property *)(data + *off);

> +	if (pr->pr_datasz > size - sizeof(*pr))

> +		return -EIO;

> +

> +	step = round_up(sizeof(*pr) + pr->pr_datasz, elf_gnu_property_align);

> +	if (step > size)

> +		return -EIO;

> +

> +	/* Properties are supposed to be unique and sorted on pr_type: */

> +	if (have_prev_type && pr->pr_type <= *prev_type)

> +		return -EIO;

> +	*prev_type = pr->pr_type;

> +

> +	ret = arch_parse_elf_property(pr->pr_type,

> +				      data + *off + sizeof(*pr),

> +				      pr->pr_datasz, ELF_COMPAT, arch);


I find it slightly hard to read the "cursor" motion in this parse. It
feels strange, for example, to refer twice to "data + *off" with the
second including consumed *pr size. Everything is fine AFAICT in the math,
though, and I haven't been able to construct a convincingly "cleaner"
version. Maybe:

	data += *off;
	pr = (const struct gnu_property *)data;
	data += sizeof(*pr);
	...
	ret = arch_parse_elf_property(pr->pr_type, data,
				      pr->pr_datasz, ELF_COMPAT, arch);

But that feels disjoint from the "step" calculation, so... I think what
you have is fine. :)

-- 
Kees Cook
Dave Martin Aug. 30, 2019, 8:34 a.m. UTC | #2
On Fri, Aug 30, 2019 at 06:37:45AM +0100, Kees Cook wrote:
> On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:

> > ELF program properties will needed for detecting whether to enable

> > optional architecture or ABI features for a new ELF process.

> > 

> > For now, there are no generic properties that we care about, so do

> > nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> > 

> > Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> > phdrs entry (if any), and notify each property to the arch code.

> > 

> > For now, the added code is not used.

> > 

> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

> 

> Reviewed-by: Kees Cook <keescook@chromium.org>


Thanks for the review.

Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to
early-terminate the scan if we can, but my feeling so far was that the
scan is cheap, the number of properties is unlikely to be more than a
smallish integer, and the code separation benefits of just calling the
arch code for every property probably likely outweigh the costs of
having to iterate over every property.  We could always optimise it
later if necessary.

I need to double-check that there's no way we can get stuck in an
infinite loop with the current code, though I've not seen it in my
testing.  I should throw some malformed notes at it though.

> Note below...

> 

> > [...]

> > +static int parse_elf_property(const char *data, size_t *off, size_t datasz,

> > +			      struct arch_elf_state *arch,

> > +			      bool have_prev_type, u32 *prev_type)

> > +{

> > +	size_t size, step;

> > +	const struct gnu_property *pr;

> > +	int ret;

> > +

> > +	if (*off == datasz)

> > +		return -ENOENT;

> > +

> > +	if (WARN_ON(*off > datasz || *off % elf_gnu_property_align))

> > +		return -EIO;

> > +

> > +	size = datasz - *off;

> > +	if (size < sizeof(*pr))

> > +		return -EIO;

> > +

> > +	pr = (const struct gnu_property *)(data + *off);

> > +	if (pr->pr_datasz > size - sizeof(*pr))

> > +		return -EIO;

> > +

> > +	step = round_up(sizeof(*pr) + pr->pr_datasz, elf_gnu_property_align);

> > +	if (step > size)

> > +		return -EIO;

> > +

> > +	/* Properties are supposed to be unique and sorted on pr_type: */

> > +	if (have_prev_type && pr->pr_type <= *prev_type)

> > +		return -EIO;

> > +	*prev_type = pr->pr_type;

> > +

> > +	ret = arch_parse_elf_property(pr->pr_type,

> > +				      data + *off + sizeof(*pr),

> > +				      pr->pr_datasz, ELF_COMPAT, arch);

> 

> I find it slightly hard to read the "cursor" motion in this parse. It

> feels strange, for example, to refer twice to "data + *off" with the

> second including consumed *pr size. Everything is fine AFAICT in the math,

> though, and I haven't been able to construct a convincingly "cleaner"

> version. Maybe:

> 

> 	data += *off;

> 	pr = (const struct gnu_property *)data;

> 	data += sizeof(*pr);

> 	...

> 	ret = arch_parse_elf_property(pr->pr_type, data,

> 				      pr->pr_datasz, ELF_COMPAT, arch);


Fair point.  The cursor is really *off, which I suppose I could update
as we go through this function, or cache in a local variable and assign
on the way out.

> But that feels disjoint from the "step" calculation, so... I think what

> you have is fine. :)


It's good to be as clear as possible about exactly how far we have
parsed, so I'll see if I can improve this when I repost.


Do you have any objection to merging patch 1 with this one?  For
upstreaming purposes, it seems overkill for that to be a separate patch.

I may also convert elf_gnu_property_align to upper case, since unlike
the other related definitions this one isn't trying to look like a
struct tag.

Do you have any opinion on the WARN_ON()s?  They should be un-hittable,
so they're documenting assumptions rather than protecting against
anything real.  Maybe I should replace them with comments.

Cheers
---Dave
Yu-cheng Yu Aug. 30, 2019, 5:03 p.m. UTC | #3
On Fri, 2019-08-30 at 09:34 +0100, Dave Martin wrote:
> On Fri, Aug 30, 2019 at 06:37:45AM +0100, Kees Cook wrote:

> > On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:

> > > ELF program properties will needed for detecting whether to enable

> > > optional architecture or ABI features for a new ELF process.

> > > 

> > > For now, there are no generic properties that we care about, so do

> > > nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> > > 

> > > Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> > > phdrs entry (if any), and notify each property to the arch code.

> > > 

> > > For now, the added code is not used.

> > > 

> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

> > 

> > Reviewed-by: Kees Cook <keescook@chromium.org>

> 

> Thanks for the review.

> 

> Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> early-terminate the scan if we can, but my feeling so far was that the

> scan is cheap, the number of properties is unlikely to be more than a

> smallish integer, and the code separation benefits of just calling the

> arch code for every property probably likely outweigh the costs of

> having to iterate over every property.  We could always optimise it

> later if necessary.

> 

> I need to double-check that there's no way we can get stuck in an

> infinite loop with the current code, though I've not seen it in my

> testing.  I should throw some malformed notes at it though.


Here is my arch_parse_elf_property() and objdump of the property.
The parser works fine.

Thanks,
Yu-cheng




int arch_parse_elf_property(u32 type, const void *data, size_t datasz,
          
                   bool compat, struct arch_elf_state *state)
{
        if (type
!= GNU_PROPERTY_X86_FEATURE_1_AND)
                return -ENOENT;

        if (datasz < sizeof(unsigned int))
                return -ENOEXEC;

        state->gnu_property = *(unsigned int *)data;
        return 0;
}

Contents of section .note.gnu.property:
 400338 04000000 30000000 05000000 474e5500  ....0.......GNU.
 400348 020000c0 04000000 03000000 00000000  ................
 400358 000001c0 04000000 00000000 00000000  ................
 400368 010001c0 04000000 01000000 00000000  ................
Dave Martin Sept. 2, 2019, 9:28 a.m. UTC | #4
On Fri, Aug 30, 2019 at 06:03:27PM +0100, Yu-cheng Yu wrote:
> On Fri, 2019-08-30 at 09:34 +0100, Dave Martin wrote:

> > On Fri, Aug 30, 2019 at 06:37:45AM +0100, Kees Cook wrote:

> > > On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:

> > > > ELF program properties will needed for detecting whether to enable

> > > > optional architecture or ABI features for a new ELF process.

> > > > 

> > > > For now, there are no generic properties that we care about, so do

> > > > nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> > > > 

> > > > Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> > > > phdrs entry (if any), and notify each property to the arch code.

> > > > 

> > > > For now, the added code is not used.

> > > > 

> > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

> > > 

> > > Reviewed-by: Kees Cook <keescook@chromium.org>

> > 

> > Thanks for the review.

> > 

> > Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> > early-terminate the scan if we can, but my feeling so far was that the

> > scan is cheap, the number of properties is unlikely to be more than a

> > smallish integer, and the code separation benefits of just calling the

> > arch code for every property probably likely outweigh the costs of

> > having to iterate over every property.  We could always optimise it

> > later if necessary.

> > 

> > I need to double-check that there's no way we can get stuck in an

> > infinite loop with the current code, though I've not seen it in my

> > testing.  I should throw some malformed notes at it though.

> 

> Here is my arch_parse_elf_property() and objdump of the property.

> The parser works fine.


[...]

> int arch_parse_elf_property(u32 type, const void *data, size_t datasz,

>           

>                    bool compat, struct arch_elf_state *state)

> {

>         if (type

> != GNU_PROPERTY_X86_FEATURE_1_AND)

>                 return -ENOENT;


For error returns, I was following this convention:

	EIO: invalid ELF file

	ENOEXEC: valid ELF file, but we can't (or won't) support it

	0: OK, or don't care

This function gets called for every property, including properties that
the arch code may not be interested in, so for properties you don't care
about here you should return 0.

> 

>         if (datasz < sizeof(unsigned int))

>                 return -ENOEXEC;


Should this be != ?

According to the draft x86-64 psABI spec [1],
X86_PROPERTY_FEATURE_1_AND (and all properties based on
GNU_PROPERTY_X86_UINT32_AND_LO) has data consisting of a single 4-byte
unsigned integer.

>         state->gnu_property = *(unsigned int *)data;

>         return 0;

> }

>

> Contents of section .note.gnu.property:

>  400338 04000000 30000000 05000000 474e5500  ....0.......GNU.

>  400348 020000c0 04000000 03000000 00000000  ................

>  400358 000001c0 04000000 00000000 00000000  ................

>  400368 010001c0 04000000 01000000 00000000  ................


Because you return ENOENT except for GNU_PROPERTY_X86_FEATURE_1_AND,
wouldn't we fail on the second and third properties here?
(GNU_PROPERTY_X86_ISA_1_USED, GNU_PROPERTY_X86_FEATURE_2_USED if I
decoded them correctly...)

If not, maybe my code is failing to iterate over all the properties for
some reason.

Cheers
---Dave

[1] https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI
https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-draft.pdf
Yu-cheng Yu Sept. 3, 2019, 10:29 p.m. UTC | #5
On Mon, 2019-09-02 at 10:28 +0100, Dave Martin wrote:
> On Fri, Aug 30, 2019 at 06:03:27PM +0100, Yu-cheng Yu wrote:

> > On Fri, 2019-08-30 at 09:34 +0100, Dave Martin wrote:

> > > On Fri, Aug 30, 2019 at 06:37:45AM +0100, Kees Cook wrote:

> > > > On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:

> > > > > ELF program properties will needed for detecting whether to enable

> > > > > optional architecture or ABI features for a new ELF process.

> > > > > 

> > > > > For now, there are no generic properties that we care about, so do

> > > > > nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> > > > > 

> > > > > Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> > > > > phdrs entry (if any), and notify each property to the arch code.

> > > > > 

> > > > > For now, the added code is not used.

> > > > > 

> > > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

> > > > 

> > > > Reviewed-by: Kees Cook <keescook@chromium.org>

> > > 

> > > Thanks for the review.

> > > 

> > > Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> > > early-terminate the scan if we can, but my feeling so far was that the

> > > scan is cheap, the number of properties is unlikely to be more than a

> > > smallish integer, and the code separation benefits of just calling the

> > > arch code for every property probably likely outweigh the costs of

> > > having to iterate over every property.  We could always optimise it

> > > later if necessary.

> > > 

> > > I need to double-check that there's no way we can get stuck in an

> > > infinite loop with the current code, though I've not seen it in my

> > > testing.  I should throw some malformed notes at it though.

> > 

> > Here is my arch_parse_elf_property() and objdump of the property.

> > The parser works fine.

> 

> [...]

> 

> > int arch_parse_elf_property(u32 type, const void *data, size_t datasz,

> >           

> >                    bool compat, struct arch_elf_state *state)

> > {

> >         if (type

> > != GNU_PROPERTY_X86_FEATURE_1_AND)

> >                 return -ENOENT;

> 

> For error returns, I was following this convention:

> 

> 	EIO: invalid ELF file

> 

> 	ENOEXEC: valid ELF file, but we can't (or won't) support it

> 

> 	0: OK, or don't care


From errno-base.h, EIO is for I/O error; ENOEXEC is for Exec format error.
Is this closer to what is happening?

> 

> This function gets called for every property, including properties that

> the arch code may not be interested in, so for properties you don't care

> about here you should return 0.


Yes.

> 

> > 

> >         if (datasz < sizeof(unsigned int))

> >                 return -ENOEXEC;

> 

> Should this be != ?

> 

> According to the draft x86-64 psABI spec [1],

> X86_PROPERTY_FEATURE_1_AND (and all properties based on

> GNU_PROPERTY_X86_UINT32_AND_LO) has data consisting of a single 4-byte

> unsigned integer.

> 

> >         state->gnu_property = *(unsigned int *)data;

> >         return 0;

> > }


Yes, I will change it.

Thanks,
Yu-cheng
Dave Martin Sept. 4, 2019, 11:05 a.m. UTC | #6
[Kees, you have any thoughts on the error code issue?  See below.]

On Tue, Sep 03, 2019 at 11:29:10PM +0100, Yu-cheng Yu wrote:
> On Mon, 2019-09-02 at 10:28 +0100, Dave Martin wrote:

> > On Fri, Aug 30, 2019 at 06:03:27PM +0100, Yu-cheng Yu wrote:

> > > On Fri, 2019-08-30 at 09:34 +0100, Dave Martin wrote:

> > > > On Fri, Aug 30, 2019 at 06:37:45AM +0100, Kees Cook wrote:

> > > > > On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:

> > > > > > ELF program properties will needed for detecting whether to enable

> > > > > > optional architecture or ABI features for a new ELF process.

> > > > > > 

> > > > > > For now, there are no generic properties that we care about, so do

> > > > > > nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> > > > > > 

> > > > > > Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> > > > > > phdrs entry (if any), and notify each property to the arch code.

> > > > > > 

> > > > > > For now, the added code is not used.

> > > > > > 

> > > > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

> > > > > 

> > > > > Reviewed-by: Kees Cook <keescook@chromium.org>

> > > > 

> > > > Thanks for the review.

> > > > 

> > > > Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> > > > early-terminate the scan if we can, but my feeling so far was that the

> > > > scan is cheap, the number of properties is unlikely to be more than a

> > > > smallish integer, and the code separation benefits of just calling the

> > > > arch code for every property probably likely outweigh the costs of

> > > > having to iterate over every property.  We could always optimise it

> > > > later if necessary.

> > > > 

> > > > I need to double-check that there's no way we can get stuck in an

> > > > infinite loop with the current code, though I've not seen it in my

> > > > testing.  I should throw some malformed notes at it though.

> > > 

> > > Here is my arch_parse_elf_property() and objdump of the property.

> > > The parser works fine.

> > 

> > [...]

> > 

> > > int arch_parse_elf_property(u32 type, const void *data, size_t datasz,

> > >           

> > >                    bool compat, struct arch_elf_state *state)

> > > {

> > >         if (type

> > > != GNU_PROPERTY_X86_FEATURE_1_AND)

> > >                 return -ENOENT;

> > 

> > For error returns, I was following this convention:

> > 

> > 	EIO: invalid ELF file

> > 

> > 	ENOEXEC: valid ELF file, but we can't (or won't) support it

> > 

> > 	0: OK, or don't care

> 

> From errno-base.h, EIO is for I/O error; ENOEXEC is for Exec format error.

> Is this closer to what is happening?


The common case of ENOEXEC is that the file is valid but can't be
executed (i.e., wrong architecture, unsupported binary format etc.).

There's precent for reporting a truncated file (seen as a kernel_read()
that reads less data than requested) being reported as -EIO.  There is
no actual I/O error here, this is just a file with a noncompliant format.
This happens on short read when trying to read the interpreter filename
from the main ELF file for example.

Based on this, I used EIO to report malformed files.

This doesn't always happen though even in the existing code: for
example, the same error occurring in load_elf_phdrs() yields ENOEXEC
from execve(), not EIO.  It's not clear which (if either) is the
correct error code.

The behaviour isn't totally consistent though, so maybe this is not
intentional.

The distinction seemed useful, but I agree this can be seen as an abuse
of EIO.

I'm happy to change this if people don't like it, but I'm not sure what
to change it to...

> > This function gets called for every property, including properties that

> > the arch code may not be interested in, so for properties you don't care

> > about here you should return 0.

> 

> Yes.

> 

> > 

> > > 

> > >         if (datasz < sizeof(unsigned int))

> > >                 return -ENOEXEC;

> > 

> > Should this be != ?

> > 

> > According to the draft x86-64 psABI spec [1],

> > X86_PROPERTY_FEATURE_1_AND (and all properties based on

> > GNU_PROPERTY_X86_UINT32_AND_LO) has data consisting of a single 4-byte

> > unsigned integer.

> > 

> > >         state->gnu_property = *(unsigned int *)data;

> > >         return 0;

> > > }

> 

> Yes, I will change it.


OK

Cheers
---Dave
Kees Cook Sept. 4, 2019, 4:50 p.m. UTC | #7
On Fri, Aug 30, 2019 at 09:34:18AM +0100, Dave Martin wrote:
> Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> early-terminate the scan if we can, but my feeling so far was that the

> scan is cheap, the number of properties is unlikely to be more than a

> smallish integer, and the code separation benefits of just calling the

> arch code for every property probably likely outweigh the costs of

> having to iterate over every property.  We could always optimise it

> later if necessary.


I feel like there are already a lot of ways to burn CPU time with
mangled ELF files, so this potential inefficiently doesn't seem bad to
me. If we want to add limits here, perhaps cap the property scan depth
(right now, IIRC, the count is u32, which is big...)

> I need to double-check that there's no way we can get stuck in an

> infinite loop with the current code, though I've not seen it in my

> testing.  I should throw some malformed notes at it though.


I think the cursor only performs forward movement, yes? I didn't see a
loop, but maybe there's something with the program headers that I
missed.

> Do you have any objection to merging patch 1 with this one?  For

> upstreaming purposes, it seems overkill for that to be a separate patch.


I don't _object_ to it, but I did like having it separate for review.

> Do you have any opinion on the WARN_ON()s?  They should be un-hittable,

> so they're documenting assumptions rather than protecting against

> anything real.  Maybe I should replace them with comments.


I think they're fine as self-documentation. My rule of thumb has been:

- don't use BUG*() unless you can defend it to Linus who wants 0 BUG()s.
- don't use WARN*() if userspace can reach it (and if you're not sure,
  use the WARN*ONCE() version)
- use pr_*_ratelimited() if unprivileged userspace can reach it.

-- 
Kees Cook
Dave Martin Sept. 5, 2019, 7:57 a.m. UTC | #8
On Wed, Sep 04, 2019 at 05:50:06PM +0100, Kees Cook wrote:
> On Fri, Aug 30, 2019 at 09:34:18AM +0100, Dave Martin wrote:

> > Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> > early-terminate the scan if we can, but my feeling so far was that the

> > scan is cheap, the number of properties is unlikely to be more than a

> > smallish integer, and the code separation benefits of just calling the

> > arch code for every property probably likely outweigh the costs of

> > having to iterate over every property.  We could always optimise it

> > later if necessary.

> 

> I feel like there are already a lot of ways to burn CPU time with

> mangled ELF files, so this potential inefficiently doesn't seem bad to

> me. If we want to add limits here, perhaps cap the property scan depth

> (right now, IIRC, the count is u32, which is big...)


I was thinking more of valid ELF files where the number of properties is
large.

I feel that the GNU properties system will have become unfit for purpose
if the number of defined properties gets large enough for this to be an
issue though.

I'll keep this code as-is for now.

> > I need to double-check that there's no way we can get stuck in an

> > infinite loop with the current code, though I've not seen it in my

> > testing.  I should throw some malformed notes at it though.

> 

> I think the cursor only performs forward movement, yes? I didn't see a

> loop, but maybe there's something with the program headers that I

> missed.


That's the principle: always step forward, always by a non-zero amount.
So forward progress should be guaranteed...

> > Do you have any objection to merging patch 1 with this one?  For

> > upstreaming purposes, it seems overkill for that to be a separate patch.

> 

> I don't _object_ to it, but I did like having it separate for review.


I'm equally happy to leave them separate; I just wasn't sure how much
they made sense as separate patches.  I'll have a think when I respin.

> > Do you have any opinion on the WARN_ON()s?  They should be un-hittable,

> > so they're documenting assumptions rather than protecting against

> > anything real.  Maybe I should replace them with comments.

> 

> I think they're fine as self-documentation. My rule of thumb has been:

> 

> - don't use BUG*() unless you can defend it to Linus who wants 0 BUG()s.

> - don't use WARN*() if userspace can reach it (and if you're not sure,

>   use the WARN*ONCE() version)

> - use pr_*_ratelimited() if unprivileged userspace can reach it.


OK, I'll probably keep them for now, then.

This isn't a super-hot path, and with multiple kernel_read() calls in
the mix already it's hard to imagine the WARN_ON() calls being a
significant part of the overall cost.

Cheers
---Dave
Dave Martin Oct. 9, 2019, 12:59 p.m. UTC | #9
On Fri, Aug 30, 2019 at 09:34:17AM +0100, Dave Martin wrote:
> On Fri, Aug 30, 2019 at 06:37:45AM +0100, Kees Cook wrote:

> > On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:

> > > ELF program properties will needed for detecting whether to enable

> > > optional architecture or ABI features for a new ELF process.

> > > 

> > > For now, there are no generic properties that we care about, so do

> > > nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> > > 

> > > Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> > > phdrs entry (if any), and notify each property to the arch code.

> > > 

> > > For now, the added code is not used.

> > > 

> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

> > 

> > Reviewed-by: Kees Cook <keescook@chromium.org>

> 

> Thanks for the review.

> 

> Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> early-terminate the scan if we can, but my feeling so far was that the

> scan is cheap, the number of properties is unlikely to be more than a

> smallish integer, and the code separation benefits of just calling the

> arch code for every property probably likely outweigh the costs of

> having to iterate over every property.  We could always optimise it

> later if necessary.

> 

> I need to double-check that there's no way we can get stuck in an

> infinite loop with the current code, though I've not seen it in my

> testing.  I should throw some malformed notes at it though.

> 

> > Note below...

> > 

> > > [...]

> > > +static int parse_elf_property(const char *data, size_t *off, size_t datasz,

> > > +			      struct arch_elf_state *arch,

> > > +			      bool have_prev_type, u32 *prev_type)

> > > +{

> > > +	size_t size, step;

> > > +	const struct gnu_property *pr;

> > > +	int ret;

> > > +

> > > +	if (*off == datasz)

> > > +		return -ENOENT;

> > > +

> > > +	if (WARN_ON(*off > datasz || *off % elf_gnu_property_align))

> > > +		return -EIO;

> > > +

> > > +	size = datasz - *off;

> > > +	if (size < sizeof(*pr))

> > > +		return -EIO;

> > > +

> > > +	pr = (const struct gnu_property *)(data + *off);

> > > +	if (pr->pr_datasz > size - sizeof(*pr))

> > > +		return -EIO;

> > > +

> > > +	step = round_up(sizeof(*pr) + pr->pr_datasz, elf_gnu_property_align);

> > > +	if (step > size)

> > > +		return -EIO;

> > > +

> > > +	/* Properties are supposed to be unique and sorted on pr_type: */

> > > +	if (have_prev_type && pr->pr_type <= *prev_type)

> > > +		return -EIO;

> > > +	*prev_type = pr->pr_type;

> > > +

> > > +	ret = arch_parse_elf_property(pr->pr_type,

> > > +				      data + *off + sizeof(*pr),

> > > +				      pr->pr_datasz, ELF_COMPAT, arch);

> > 

> > I find it slightly hard to read the "cursor" motion in this parse. It

> > feels strange, for example, to refer twice to "data + *off" with the

> > second including consumed *pr size. Everything is fine AFAICT in the math,

> > though, and I haven't been able to construct a convincingly "cleaner"

> > version. Maybe:

> > 

> > 	data += *off;

> > 	pr = (const struct gnu_property *)data;

> > 	data += sizeof(*pr);

> > 	...

> > 	ret = arch_parse_elf_property(pr->pr_type, data,

> > 				      pr->pr_datasz, ELF_COMPAT, arch);

> 

> Fair point.  The cursor is really *off, which I suppose I could update

> as we go through this function, or cache in a local variable and assign

> on the way out.

> 

> > But that feels disjoint from the "step" calculation, so... I think what

> > you have is fine. :)

> 

> It's good to be as clear as possible about exactly how far we have

> parsed, so I'll see if I can improve this when I repost.

> 

> 

> Do you have any objection to merging patch 1 with this one?  For

> upstreaming purposes, it seems overkill for that to be a separate patch.

> 

> I may also convert elf_gnu_property_align to upper case, since unlike

> the other related definitions this one isn't trying to look like a

> struct tag.

> 

> Do you have any opinion on the WARN_ON()s?  They should be un-hittable,

> so they're documenting assumptions rather than protecting against

> anything real.  Maybe I should replace them with comments.


FYI, I'm going to be inactive for a while, so I'm not going to be able
to push this patch further.

Mark Brown will be picking up the arm64 BTI series, so it will probably
make sense if he pulls it into that series.

Any thoughts?

Cheers
---Dave
Kees Cook Oct. 10, 2019, 9 p.m. UTC | #10
On Wed, Oct 09, 2019 at 01:59:13PM +0100, Dave Martin wrote:
> On Fri, Aug 30, 2019 at 09:34:17AM +0100, Dave Martin wrote:

> > On Fri, Aug 30, 2019 at 06:37:45AM +0100, Kees Cook wrote:

> > > On Fri, Aug 23, 2019 at 06:23:40PM +0100, Dave Martin wrote:

> > > > ELF program properties will needed for detecting whether to enable

> > > > optional architecture or ABI features for a new ELF process.

> > > > 

> > > > For now, there are no generic properties that we care about, so do

> > > > nothing unless CONFIG_ARCH_USE_GNU_PROPERTY=y.

> > > > 

> > > > Otherwise, the presence of properties using the PT_PROGRAM_PROPERTY

> > > > phdrs entry (if any), and notify each property to the arch code.

> > > > 

> > > > For now, the added code is not used.

> > > > 

> > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>

> > > 

> > > Reviewed-by: Kees Cook <keescook@chromium.org>

> > 

> > Thanks for the review.

> > 

> > Do you have any thoughts on Yu-Cheng Yu's comments?  It would be nice to

> > early-terminate the scan if we can, but my feeling so far was that the

> > scan is cheap, the number of properties is unlikely to be more than a

> > smallish integer, and the code separation benefits of just calling the

> > arch code for every property probably likely outweigh the costs of

> > having to iterate over every property.  We could always optimise it

> > later if necessary.

> > 

> > I need to double-check that there's no way we can get stuck in an

> > infinite loop with the current code, though I've not seen it in my

> > testing.  I should throw some malformed notes at it though.

> > 

> > > Note below...

> > > 

> > > > [...]

> > > > +static int parse_elf_property(const char *data, size_t *off, size_t datasz,

> > > > +			      struct arch_elf_state *arch,

> > > > +			      bool have_prev_type, u32 *prev_type)

> > > > +{

> > > > +	size_t size, step;

> > > > +	const struct gnu_property *pr;

> > > > +	int ret;

> > > > +

> > > > +	if (*off == datasz)

> > > > +		return -ENOENT;

> > > > +

> > > > +	if (WARN_ON(*off > datasz || *off % elf_gnu_property_align))

> > > > +		return -EIO;

> > > > +

> > > > +	size = datasz - *off;

> > > > +	if (size < sizeof(*pr))

> > > > +		return -EIO;

> > > > +

> > > > +	pr = (const struct gnu_property *)(data + *off);

> > > > +	if (pr->pr_datasz > size - sizeof(*pr))

> > > > +		return -EIO;

> > > > +

> > > > +	step = round_up(sizeof(*pr) + pr->pr_datasz, elf_gnu_property_align);

> > > > +	if (step > size)

> > > > +		return -EIO;

> > > > +

> > > > +	/* Properties are supposed to be unique and sorted on pr_type: */

> > > > +	if (have_prev_type && pr->pr_type <= *prev_type)

> > > > +		return -EIO;

> > > > +	*prev_type = pr->pr_type;

> > > > +

> > > > +	ret = arch_parse_elf_property(pr->pr_type,

> > > > +				      data + *off + sizeof(*pr),

> > > > +				      pr->pr_datasz, ELF_COMPAT, arch);

> > > 

> > > I find it slightly hard to read the "cursor" motion in this parse. It

> > > feels strange, for example, to refer twice to "data + *off" with the

> > > second including consumed *pr size. Everything is fine AFAICT in the math,

> > > though, and I haven't been able to construct a convincingly "cleaner"

> > > version. Maybe:

> > > 

> > > 	data += *off;

> > > 	pr = (const struct gnu_property *)data;

> > > 	data += sizeof(*pr);

> > > 	...

> > > 	ret = arch_parse_elf_property(pr->pr_type, data,

> > > 				      pr->pr_datasz, ELF_COMPAT, arch);

> > 

> > Fair point.  The cursor is really *off, which I suppose I could update

> > as we go through this function, or cache in a local variable and assign

> > on the way out.

> > 

> > > But that feels disjoint from the "step" calculation, so... I think what

> > > you have is fine. :)

> > 

> > It's good to be as clear as possible about exactly how far we have

> > parsed, so I'll see if I can improve this when I repost.

> > 

> > 

> > Do you have any objection to merging patch 1 with this one?  For

> > upstreaming purposes, it seems overkill for that to be a separate patch.

> > 

> > I may also convert elf_gnu_property_align to upper case, since unlike

> > the other related definitions this one isn't trying to look like a

> > struct tag.

> > 

> > Do you have any opinion on the WARN_ON()s?  They should be un-hittable,

> > so they're documenting assumptions rather than protecting against

> > anything real.  Maybe I should replace them with comments.

> 

> FYI, I'm going to be inactive for a while, so I'm not going to be able

> to push this patch further.

> 

> Mark Brown will be picking up the arm64 BTI series, so it will probably

> make sense if he pulls it into that series.

> 

> Any thoughts?


Okay, sounds good. Mark, I think these patches are in good shape. Can
you include me on CC where you pick these up?

Thanks!

-Kees

-- 
Kees Cook
diff mbox series

Patch

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index d4e11b2..d6541e8 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -39,12 +39,18 @@ 
 #include <linux/sched/coredump.h>
 #include <linux/sched/task_stack.h>
 #include <linux/sched/cputime.h>
+#include <linux/sizes.h>
+#include <linux/types.h>
 #include <linux/cred.h>
 #include <linux/dax.h>
 #include <linux/uaccess.h>
 #include <asm/param.h>
 #include <asm/page.h>
 
+#ifndef ELF_COMPAT
+#define ELF_COMPAT 0
+#endif
+
 #ifndef user_long_t
 #define user_long_t long
 #endif
@@ -690,6 +696,108 @@  static unsigned long randomize_stack_top(unsigned long stack_top)
 #endif
 }
 
+static int parse_elf_property(const char *data, size_t *off, size_t datasz,
+			      struct arch_elf_state *arch,
+			      bool have_prev_type, u32 *prev_type)
+{
+	size_t size, step;
+	const struct gnu_property *pr;
+	int ret;
+
+	if (*off == datasz)
+		return -ENOENT;
+
+	if (WARN_ON(*off > datasz || *off % elf_gnu_property_align))
+		return -EIO;
+
+	size = datasz - *off;
+	if (size < sizeof(*pr))
+		return -EIO;
+
+	pr = (const struct gnu_property *)(data + *off);
+	if (pr->pr_datasz > size - sizeof(*pr))
+		return -EIO;
+
+	step = round_up(sizeof(*pr) + pr->pr_datasz, elf_gnu_property_align);
+	if (step > size)
+		return -EIO;
+
+	/* Properties are supposed to be unique and sorted on pr_type: */
+	if (have_prev_type && pr->pr_type <= *prev_type)
+		return -EIO;
+	*prev_type = pr->pr_type;
+
+	ret = arch_parse_elf_property(pr->pr_type,
+				      data + *off + sizeof(*pr),
+				      pr->pr_datasz, ELF_COMPAT, arch);
+	if (ret)
+		return ret;
+
+	*off += step;
+	return 0;
+}
+
+#define NOTE_DATA_SZ SZ_1K
+#define GNU_PROPERTY_TYPE_0_NAME "GNU"
+#define NOTE_NAME_SZ (sizeof(GNU_PROPERTY_TYPE_0_NAME))
+
+static int parse_elf_properties(struct file *f, const struct elf_phdr *phdr,
+				struct arch_elf_state *arch)
+{
+	union {
+		struct elf_note nhdr;
+		char data[NOTE_DATA_SZ];
+	} note;
+	loff_t pos;
+	ssize_t n;
+	size_t off, datasz;
+	int ret;
+	bool have_prev_type;
+	u32 prev_type;
+
+	if (!IS_ENABLED(CONFIG_ARCH_USE_GNU_PROPERTY) || !phdr)
+		return 0;
+
+	/* load_elf_binary() shouldn't call us unless this is true... */
+	if (WARN_ON(phdr->p_type != PT_GNU_PROPERTY))
+		return -EIO;
+
+	/* If the properties are crazy large, that's too bad (for now): */
+	if (phdr->p_filesz > sizeof(note))
+		return -ENOEXEC;
+
+	pos = phdr->p_offset;
+	n = kernel_read(f, &note, phdr->p_filesz, &pos);
+
+	BUILD_BUG_ON(sizeof(note) < sizeof(note.nhdr) + NOTE_NAME_SZ);
+	if (n < 0 || n < sizeof(note.nhdr) + NOTE_NAME_SZ)
+		return -EIO;
+
+	if (note.nhdr.n_type != NT_GNU_PROPERTY_TYPE_0 ||
+	    note.nhdr.n_namesz != NOTE_NAME_SZ ||
+	    strncmp(note.data + sizeof(note.nhdr),
+		    GNU_PROPERTY_TYPE_0_NAME, n - sizeof(note.nhdr)))
+		return -EIO;
+
+	off = round_up(sizeof(note.nhdr) + NOTE_NAME_SZ,
+		       elf_gnu_property_align);
+	if (off > n)
+		return -EIO;
+
+	if (note.nhdr.n_descsz > n - off)
+		return -EIO;
+	datasz = off + note.nhdr.n_descsz;
+
+	have_prev_type = false;
+	do {
+		ret = parse_elf_property(note.data, &off, datasz, arch,
+					 have_prev_type, &prev_type);
+		have_prev_type = true;
+	} while (!ret);
+
+	return ret == -ENOENT ? 0 : ret;
+}
+
 static int load_elf_binary(struct linux_binprm *bprm)
 {
 	struct file *interpreter = NULL; /* to shut gcc up */
@@ -697,6 +805,7 @@  static int load_elf_binary(struct linux_binprm *bprm)
 	int load_addr_set = 0;
 	unsigned long error;
 	struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL;
+	struct elf_phdr *elf_property_phdata = NULL;
 	unsigned long elf_bss, elf_brk;
 	int bss_prot = 0;
 	int retval, i;
@@ -744,6 +853,11 @@  static int load_elf_binary(struct linux_binprm *bprm)
 		char *elf_interpreter;
 		loff_t pos;
 
+		if (elf_ppnt->p_type == PT_GNU_PROPERTY) {
+			elf_property_phdata = elf_ppnt;
+			continue;
+		}
+
 		if (elf_ppnt->p_type != PT_INTERP)
 			continue;
 
@@ -839,9 +953,14 @@  static int load_elf_binary(struct linux_binprm *bprm)
 			goto out_free_dentry;
 
 		/* Pass PT_LOPROC..PT_HIPROC headers to arch code */
+		elf_property_phdata = NULL;
 		elf_ppnt = interp_elf_phdata;
 		for (i = 0; i < loc->interp_elf_ex.e_phnum; i++, elf_ppnt++)
 			switch (elf_ppnt->p_type) {
+			case PT_GNU_PROPERTY:
+				elf_property_phdata = elf_ppnt;
+				break;
+
 			case PT_LOPROC ... PT_HIPROC:
 				retval = arch_elf_pt_proc(&loc->interp_elf_ex,
 							  elf_ppnt, interpreter,
@@ -852,6 +971,11 @@  static int load_elf_binary(struct linux_binprm *bprm)
 			}
 	}
 
+	retval = parse_elf_properties(interpreter ?: bprm->file,
+				      elf_property_phdata, &arch_state);
+	if (retval)
+		goto out_free_dentry;
+
 	/*
 	 * Allow arch code to reject the ELF at this point, whilst it's
 	 * still possible to return an error to the code that invoked
diff --git a/fs/compat_binfmt_elf.c b/fs/compat_binfmt_elf.c
index b7f9ffa..f9ee476 100644
--- a/fs/compat_binfmt_elf.c
+++ b/fs/compat_binfmt_elf.c
@@ -17,6 +17,8 @@ 
 #include <linux/elfcore-compat.h>
 #include <linux/time.h>
 
+#define ELF_COMPAT	1
+
 /*
  * Rename the basic ELF layout types to refer to the 32-bit class of files.
  */
@@ -28,11 +30,13 @@ 
 #undef	elf_shdr
 #undef	elf_note
 #undef	elf_addr_t
+#undef	elf_gnu_property_align
 #define elfhdr		elf32_hdr
 #define elf_phdr	elf32_phdr
 #define elf_shdr	elf32_shdr
 #define elf_note	elf32_note
 #define elf_addr_t	Elf32_Addr
+#define elf_gnu_property_align	elf32_gnu_property_align
 
 /*
  * Some data types as stored in coredump.
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 4485499..d9779c0 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -22,6 +22,9 @@ 
 	SET_PERSONALITY(ex)
 #endif
 
+#define elf32_gnu_property_align	4
+#define elf64_gnu_property_align	8
+
 #if ELF_CLASS == ELFCLASS32
 
 extern Elf32_Dyn _DYNAMIC [];
@@ -32,6 +35,7 @@  extern Elf32_Dyn _DYNAMIC [];
 #define elf_addr_t	Elf32_Off
 #define Elf_Half	Elf32_Half
 #define Elf_Word	Elf32_Word
+#define elf_gnu_property_align	elf32_gnu_property_align
 
 #else
 
@@ -43,6 +47,7 @@  extern Elf64_Dyn _DYNAMIC [];
 #define elf_addr_t	Elf64_Off
 #define Elf_Half	Elf64_Half
 #define Elf_Word	Elf64_Word
+#define elf_gnu_property_align	elf64_gnu_property_align
 
 #endif
 
@@ -64,4 +69,18 @@  struct gnu_property {
   __u32 pr_datasz;
 };
 
+struct arch_elf_state;
+
+#ifndef CONFIG_ARCH_USE_GNU_PROPERTY
+static inline int arch_parse_elf_property(u32 type, const void *data,
+					  size_t datasz, bool compat,
+					  struct arch_elf_state *arch)
+{
+	return 0;
+}
+#else
+extern int arch_parse_elf_property(u32 type, const void *data, size_t datasz,
+				   bool compat, struct arch_elf_state *arch);
+#endif
+
 #endif /* _LINUX_ELF_H */
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index c377314..20900f4 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -368,6 +368,7 @@  typedef struct elf64_shdr {
  * Notes used in ET_CORE. Architectures export some of the arch register sets
  * using the corresponding note types via the PTRACE_GETREGSET and
  * PTRACE_SETREGSET requests.
+ * The note name for all these is "LINUX".
  */
 #define NT_PRSTATUS	1
 #define NT_PRFPREG	2
@@ -430,6 +431,9 @@  typedef struct elf64_shdr {
 #define NT_MIPS_FP_MODE	0x801		/* MIPS floating-point mode */
 #define NT_MIPS_MSA	0x802		/* MIPS SIMD registers */
 
+/* Note types with note name "GNU" */
+#define NT_GNU_PROPERTY_TYPE_0	5
+
 /* Note header in a PT_NOTE section */
 typedef struct elf32_note {
   Elf32_Word	n_namesz;	/* Name size */