lib/lzo: Avoid output overruns when compressing

Message ID Z7rGXJSX57gEfXPw@gondor.apana.org.au
State New
Series lib/lzo: Avoid output overruns when compressing

Commit Message

Herbert Xu Feb. 23, 2025, 6:55 a.m. UTC
The compression code in LZO never checked for output overruns.
Fix this by checking for end of buffer before each write.

Fixes: 64c70b1cf43d ("Add LZO1X algorithm to the kernel")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Comments

David Sterba Feb. 26, 2025, 1 p.m. UTC | #1
On Sun, Feb 23, 2025 at 02:55:24PM +0800, Herbert Xu wrote:
> The compression code in LZO never checked for output overruns.
> Fix this by checking for end of buffer before each write.

Does it have to check for overruns? The worst-case compression result
size is known and can be calculated by a formula. Using a big enough
buffer is part of the correct usage of LZO. All in-kernel users of
lzo1x_1_compress() seem to provide a target buffer sized by
lzo1x_worst_compress(): F2FS, JFFS2, BTRFS. Not sure about ZRAM.

What strikes me as alarming is that you insert about 20 branches into a
realtime compression algorithm, where everything is basically a hot
path.  These are branches that almost never happen, and never if the
output buffer is big enough.

Please drop the patch.
Herbert Xu Feb. 27, 2025, 1:47 a.m. UTC | #2
On Sun, Feb 23, 2025 at 05:40:31PM +0900, Sergey Senozhatsky wrote:
>
> A bit of a pity that NEED_OP() with "goto output_overrun" is only
> for decompression.  It looks equally relevant to the compression
> path.  I'm not insisting on doing something similar in compression,
> though.

I thought HAVE_OP looked nicer but I'll switch to NEED_OP for
the sake of consistency.

Thanks,
David Sterba Feb. 27, 2025, 3:16 a.m. UTC | #3
On Thu, Feb 27, 2025 at 09:46:10AM +0800, Herbert Xu wrote:
> On Wed, Feb 26, 2025 at 02:00:37PM +0100, David Sterba wrote:
> >
> > Does it have to check for the overruns? The worst case compression
> > result size is known and can be calculated by the formula. Using big
> 
> If the caller is using different algorithms, then yes the checks
> are essential.  Otherwise the caller would have to allocate enough
> memory not just for LZO, but for the worst-case compression length
> for *any* algorithm.  Adding a single algorithm would have the
> potential of breaking all users.
>  
> > What strikes me as alarming is that you insert about 20 branches into a
> > realtime compression algorithm, where everything is basically a hot
> > path.  These are branches that almost never happen, and never if the
> > output buffer is big enough.
> 
> OK, if that is a real concern then I will add a _safe version of
> LZO compression alongside the existing code.

Makes sense, thanks. The in-kernel users are OK, but the crypto API also
exports the compression, so there's no guarantee it's used correctly. As
it needs changes to the LZO code itself, I don't see a better way than to
have two versions, conveniently done by the macros as you did.
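
Roughly, the "two versions done by the macros" idea could look like the
sketch below; this is an illustration only, with made-up names, not the
actual series. One emit helper body is compiled twice: once with the
bounds check short-circuited so the fast path stays branch-free, and
once with a real HAVE_OP()-style check.

#include <stddef.h>

#define LZO_E_OK		0
#define LZO_E_OUTPUT_OVERRUN	(-5)

/* One body, parameterised on the bounds check expression. */
#define DEFINE_EMIT_BYTE(name, have_op)					\
static int name(unsigned char **opp, unsigned char *op_end,		\
		unsigned char byte)					\
{									\
	unsigned char *op = *opp;					\
	if (!(have_op))							\
		return LZO_E_OUTPUT_OVERRUN;				\
	*op++ = byte;							\
	*opp = op;							\
	return LZO_E_OK;						\
}

/* Fast variant: the check is constant-true and the compiler drops it. */
DEFINE_EMIT_BYTE(emit_byte_fast, 1)
/* Safe variant: a real check against the end of the output buffer. */
DEFINE_EMIT_BYTE(emit_byte_safe, (size_t)(op_end - op) >= 1)
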
Ard Biesheuvel Feb. 28, 2025, 12:43 p.m. UTC | #4
On Fri, 28 Feb 2025 at 06:24, Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (25/02/26 14:00), David Sterba wrote:
> > What strikes me as alarming is that you insert about 20 branches into a
> > realtime compression algorithm, where everything is basically a hot
> > path.  These are branches that almost never happen, and never if the
> > output buffer is big enough.
> >
> > Please drop the patch.
>
> David, just for educational purposes, there's only a safe variant of lzo
> decompression, which seems to be doing a lot of NEED_OP (HAVE_OP) checks,
> adding branches and so on, basically what Herbert is adding to the
> compression path.  So my question is: why is NEED_OP
> (if (!HAVE_OP(x)) goto output_overrun) a no-go for compression, but
> apparently fine for decompression?
>

Because compression has a bounded worst case (compressing data with
LZO can actually increase the size but only by a limited amount),
whereas decompressing a small input could produce gigabytes of output.
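
To put numbers on that: in the kernel, lzo1x_worst_compress(x) is
x + x/16 + 64 + 3, so 4096 bytes of input can never compress to more than
4419 bytes, whereas nothing in a compressed stream bounds the decompressed
size except the dst_len the caller passes in. A minimal caller-side sketch
of that asymmetry (the function name is illustrative):

#include <linux/lzo.h>

/* Decompression: the only bound available is the one the caller imposes,
 * so the safe decompressor takes dst_len and stops there, returning
 * LZO_E_OUTPUT_OVERRUN instead of writing past the buffer. */
static int example_decompress(const unsigned char *src, size_t src_len,
			      unsigned char *dst, size_t dst_size)
{
	size_t dst_len = dst_size;

	return lzo1x_decompress_safe(src, src_len, dst, &dst_len);
}
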
Sergey Senozhatsky Feb. 28, 2025, 1:55 p.m. UTC | #5
On (25/02/28 13:43), Ard Biesheuvel wrote:
> On Fri, 28 Feb 2025 at 06:24, Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> >
> > On (25/02/26 14:00), David Sterba wrote:
> > > What strikes me as alarming is that you insert about 20 branches into a
> > > realtime compression algorithm, where everything is basically a hot
> > > path.  These are branches that almost never happen, and never if the
> > > output buffer is big enough.
> > >
> > > Please drop the patch.
> >
> > David, just for educational purposes, there's only a safe variant of lzo
> > decompression, which seems to be doing a lot of NEED_OP (HAVE_OP) checks,
> > adding branches and so on, basically what Herbert is adding to the
> > compression path.  So my question is: why is NEED_OP
> > (if (!HAVE_OP(x)) goto output_overrun) a no-go for compression, but
> > apparently fine for decompression?
> >
> 
> Because compression has a bounded worst case (compressing data with
> LZO can actually increase the size but only by a limited amount),
> whereas decompressing a small input could produce gigabytes of output.

One can argue that sometimes we know the upper bound.  E.g. zswap
and zram always compress physical pages, so decompression cannot
produce data bigger than the original physical page.
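
A hedged sketch of what that known upper bound looks like in practice for
page-based users (the helper name is made up, and real zram/zswap code
goes through the crypto/zcomp layer rather than calling LZO directly):

#include <linux/lzo.h>
#include <linux/mm.h>

/* Hypothetical page decompression: the original data was exactly one
 * page, so PAGE_SIZE is a hard upper bound for dst_len. */
static int example_decompress_page(const unsigned char *src, size_t src_len,
				   struct page *page)
{
	size_t dst_len = PAGE_SIZE;
	void *dst = page_address(page);
	int ret;

	ret = lzo1x_decompress_safe(src, src_len, dst, &dst_len);
	if (ret == LZO_E_OK && dst_len != PAGE_SIZE)
		ret = LZO_E_ERROR;	/* corrupt stream: short output */
	return ret;
}
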

Patch

diff --git a/lib/lzo/lzo1x_compress.c b/lib/lzo/lzo1x_compress.c
index 47d6d43ea957..5d2f2f851694 100644
--- a/lib/lzo/lzo1x_compress.c
+++ b/lib/lzo/lzo1x_compress.c
@@ -18,10 +18,10 @@ 
 #include <linux/lzo.h>
 #include "lzodefs.h"
 
-static noinline size_t
+static noinline int
 lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
-		    unsigned char *out, size_t *out_len,
-		    size_t ti, void *wrkmem, signed char *state_offset,
+		    unsigned char **out, unsigned char *op_end,
+		    size_t *tp, void *wrkmem, signed char *state_offset,
 		    const unsigned char bitstream_version)
 {
 	const unsigned char *ip;
@@ -30,8 +30,9 @@  lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 	const unsigned char * const ip_end = in + in_len - 20;
 	const unsigned char *ii;
 	lzo_dict_t * const dict = (lzo_dict_t *) wrkmem;
+	size_t ti = *tp;
 
-	op = out;
+	op = *out;
 	ip = in;
 	ii = ip;
 	ip += ti < 4 ? 4 - ti : 0;
@@ -116,25 +117,41 @@  lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 		if (t != 0) {
 			if (t <= 3) {
 				op[*state_offset] |= t;
+				if (!HAVE_OP(4))
+					return LZO_E_OUTPUT_OVERRUN;
 				COPY4(op, ii);
 				op += t;
 			} else if (t <= 16) {
+				if (!HAVE_OP(1))
+					return LZO_E_OUTPUT_OVERRUN;
 				*op++ = (t - 3);
+				if (!HAVE_OP(16))
+					return LZO_E_OUTPUT_OVERRUN;
 				COPY8(op, ii);
 				COPY8(op + 8, ii + 8);
 				op += t;
 			} else {
 				if (t <= 18) {
+					if (!HAVE_OP(1))
+						return LZO_E_OUTPUT_OVERRUN;
 					*op++ = (t - 3);
 				} else {
 					size_t tt = t - 18;
+					if (!HAVE_OP(1))
+						return LZO_E_OUTPUT_OVERRUN;
 					*op++ = 0;
 					while (unlikely(tt > 255)) {
 						tt -= 255;
+						if (!HAVE_OP(1))
+							return LZO_E_OUTPUT_OVERRUN;
 						*op++ = 0;
 					}
+					if (!HAVE_OP(1))
+						return LZO_E_OUTPUT_OVERRUN;
 					*op++ = tt;
 				}
+				if (!HAVE_OP(t))
+					return LZO_E_OUTPUT_OVERRUN;
 				do {
 					COPY8(op, ii);
 					COPY8(op + 8, ii + 8);
@@ -151,6 +168,8 @@  lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 		if (unlikely(run_length)) {
 			ip += run_length;
 			run_length -= MIN_ZERO_RUN_LENGTH;
+			if (!HAVE_OP(4))
+				return LZO_E_OUTPUT_OVERRUN;
 			put_unaligned_le32((run_length << 21) | 0xfffc18
 					   | (run_length & 0x7), op);
 			op += 4;
@@ -243,10 +262,14 @@  lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 		ip += m_len;
 		if (m_len <= M2_MAX_LEN && m_off <= M2_MAX_OFFSET) {
 			m_off -= 1;
+			if (!HAVE_OP(2))
+				return LZO_E_OUTPUT_OVERRUN;
 			*op++ = (((m_len - 1) << 5) | ((m_off & 7) << 2));
 			*op++ = (m_off >> 3);
 		} else if (m_off <= M3_MAX_OFFSET) {
 			m_off -= 1;
+			if (!HAVE_OP(1))
+				return LZO_E_OUTPUT_OVERRUN;
 			if (m_len <= M3_MAX_LEN)
 				*op++ = (M3_MARKER | (m_len - 2));
 			else {
@@ -254,14 +277,22 @@  lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 				*op++ = M3_MARKER | 0;
 				while (unlikely(m_len > 255)) {
 					m_len -= 255;
+					if (!HAVE_OP(1))
+						return LZO_E_OUTPUT_OVERRUN;
 					*op++ = 0;
 				}
+				if (!HAVE_OP(1))
+					return LZO_E_OUTPUT_OVERRUN;
 				*op++ = (m_len);
 			}
+			if (!HAVE_OP(2))
+				return LZO_E_OUTPUT_OVERRUN;
 			*op++ = (m_off << 2);
 			*op++ = (m_off >> 6);
 		} else {
 			m_off -= 0x4000;
+			if (!HAVE_OP(1))
+				return LZO_E_OUTPUT_OVERRUN;
 			if (m_len <= M4_MAX_LEN)
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8)
 						| (m_len - 2));
@@ -282,11 +313,17 @@  lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 				m_len -= M4_MAX_LEN;
 				*op++ = (M4_MARKER | ((m_off >> 11) & 8));
 				while (unlikely(m_len > 255)) {
+					if (!HAVE_OP(1))
+						return LZO_E_OUTPUT_OVERRUN;
 					m_len -= 255;
 					*op++ = 0;
 				}
+				if (!HAVE_OP(1))
+					return LZO_E_OUTPUT_OVERRUN;
 				*op++ = (m_len);
 			}
+			if (!HAVE_OP(2))
+				return LZO_E_OUTPUT_OVERRUN;
 			*op++ = (m_off << 2);
 			*op++ = (m_off >> 6);
 		}
@@ -295,14 +332,16 @@  lzo1x_1_do_compress(const unsigned char *in, size_t in_len,
 		ii = ip;
 		goto next;
 	}
-	*out_len = op - out;
-	return in_end - (ii - ti);
+	*out = op;
+	*tp = in_end - (ii - ti);
+	return LZO_E_OK;
 }
 
 static int lzogeneric1x_1_compress(const unsigned char *in, size_t in_len,
 		     unsigned char *out, size_t *out_len,
 		     void *wrkmem, const unsigned char bitstream_version)
 {
+	unsigned char * const op_end = out + *out_len;
 	const unsigned char *ip = in;
 	unsigned char *op = out;
 	unsigned char *data_start;
@@ -326,14 +365,17 @@  static int lzogeneric1x_1_compress(const unsigned char *in, size_t in_len,
 	while (l > 20) {
 		size_t ll = min_t(size_t, l, m4_max_offset + 1);
 		uintptr_t ll_end = (uintptr_t) ip + ll;
+		int err;
+
 		if ((ll_end + ((t + ll) >> 5)) <= ll_end)
 			break;
 		BUILD_BUG_ON(D_SIZE * sizeof(lzo_dict_t) > LZO1X_1_MEM_COMPRESS);
 		memset(wrkmem, 0, D_SIZE * sizeof(lzo_dict_t));
-		t = lzo1x_1_do_compress(ip, ll, op, out_len, t, wrkmem,
-					&state_offset, bitstream_version);
+		err = lzo1x_1_do_compress(ip, ll, &op, op_end, &t, wrkmem,
+					  &state_offset, bitstream_version);
+		if (err != LZO_E_OK)
+			return err;
 		ip += ll;
-		op += *out_len;
 		l  -= ll;
 	}
 	t += l;
@@ -342,20 +384,32 @@  static int lzogeneric1x_1_compress(const unsigned char *in, size_t in_len,
 		const unsigned char *ii = in + in_len - t;
 
 		if (op == data_start && t <= 238) {
+			if (!HAVE_OP(1))
+				return LZO_E_OUTPUT_OVERRUN;
 			*op++ = (17 + t);
 		} else if (t <= 3) {
 			op[state_offset] |= t;
 		} else if (t <= 18) {
+			if (!HAVE_OP(1))
+				return LZO_E_OUTPUT_OVERRUN;
 			*op++ = (t - 3);
 		} else {
 			size_t tt = t - 18;
+			if (!HAVE_OP(1))
+				return LZO_E_OUTPUT_OVERRUN;
 			*op++ = 0;
 			while (tt > 255) {
 				tt -= 255;
+				if (!HAVE_OP(1))
+					return LZO_E_OUTPUT_OVERRUN;
 				*op++ = 0;
 			}
+			if (!HAVE_OP(1))
+				return LZO_E_OUTPUT_OVERRUN;
 			*op++ = tt;
 		}
+		if (!HAVE_OP(t))
+			return LZO_E_OUTPUT_OVERRUN;
 		if (t >= 16) do {
 			COPY8(op, ii);
 			COPY8(op + 8, ii + 8);
@@ -368,6 +422,8 @@  static int lzogeneric1x_1_compress(const unsigned char *in, size_t in_len,
 		} while (--t > 0);
 	}
 
+	if (!HAVE_OP(3))
+		return LZO_E_OUTPUT_OVERRUN;
 	*op++ = M4_MARKER | 1;
 	*op++ = 0;
 	*op++ = 0;
diff --git a/lib/lzo/lzo1x_decompress_safe.c b/lib/lzo/lzo1x_decompress_safe.c
index c94f4928e188..4d5a1b58a4a0 100644
--- a/lib/lzo/lzo1x_decompress_safe.c
+++ b/lib/lzo/lzo1x_decompress_safe.c
@@ -21,7 +21,6 @@ 
 #include "lzodefs.h"
 
 #define HAVE_IP(x)      ((size_t)(ip_end - ip) >= (size_t)(x))
-#define HAVE_OP(x)      ((size_t)(op_end - op) >= (size_t)(x))
 #define NEED_IP(x)      if (!HAVE_IP(x)) goto input_overrun
 #define NEED_OP(x)      if (!HAVE_OP(x)) goto output_overrun
 #define TEST_LB(m_pos)  if ((m_pos) < out) goto lookbehind_overrun
diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h
index b60851fcf6ce..8b1a46993acf 100644
--- a/lib/lzo/lzodefs.h
+++ b/lib/lzo/lzodefs.h
@@ -19,6 +19,7 @@ 
  */
 #define LZO_VERSION 1
 
+#define HAVE_OP(x)      ((size_t)(op_end - op) >= (size_t)(x))
 #define COPY4(dst, src)	\
 		put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
 #if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)