Message ID | f00c0c0afa080c60f016df1471158c1caf983c34.1708007371.git.sd@queasysnail.net |
---|---|
State | Accepted |
Commit | ec823bf3a479d42c589dc0f28ef4951c49cd2d2a |
Headers | show |
Series | tls: fixes for record type handling with PEEK | expand |
On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote: > @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > u8 *control, > size_t skip, > size_t len, > - bool is_peek) > + bool is_peek, > + bool *more) > { > struct sk_buff *skb = skb_peek(&ctx->rx_list); > struct tls_msg *tlm; > @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > > out: > return copied ? : err; > +more: > + if (more) > + *more = true; > + goto out; Patches look correct, one small nit here - I don't have great ideas how to avoid the 7th argument completely but I think it'd be a little cleaner if we either: - passed in err as an output argument (some datagram code does that IIRC), then function can always return copied directly, or - passed copied as an output argument, and then we can always return err? I like the former a little better because we won't have to special case NULL for the "after async decryption" call sites.
2024-02-19, 12:07:03 -0800, Jakub Kicinski wrote: > On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote: > > @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > > u8 *control, > > size_t skip, > > size_t len, > > - bool is_peek) > > + bool is_peek, > > + bool *more) > > { > > struct sk_buff *skb = skb_peek(&ctx->rx_list); > > struct tls_msg *tlm; > > > > @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > > > > out: > > return copied ? : err; > > +more: > > + if (more) > > + *more = true; > > + goto out; > > Patches look correct, one small nit here - > > I don't have great ideas how to avoid the 7th argument completely but I hesitated between this patch and a variant combining is_peek and more into a single u8 *flags, but that felt a bit messy (or does that fall into what you describe as "not [having] great ideas"? :)) @@ -1772,9 +1777,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, u8 *control, size_t skip, size_t len, - bool is_peek) + u8 *flags) { struct sk_buff *skb = skb_peek(&ctx->rx_list); + bool is_peek = *flags & RXLIST_PEEK; struct tls_msg *tlm; ssize_t copied = 0; int err; [...] @@ -1844,6 +1850,9 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, out: return copied ? : err; +more: + *flags |= RXLIST_MORE; + goto out; } and then in tls_sw_recvmsg: u8 rxlist_flags = is_peek ? RXLIST_PEEK : 0; err = process_rx_list(ctx, msg, &control, 0, len, &rxlist_flags); > I think it'd be a little cleaner if we either: > - passed in err as an output argument (some datagram code does that > IIRC), then function can always return copied directly, or (yes, __skb_wait_for_more_packets, __skb_try_recv_datagram, and their variants) > - passed copied as an output argument, and then we can always return > err? Aren't those 2 options adding an 8th argument? I tend to find ">= 0 on success, otherwise errno" more readable, probably because that's a very common pattern (either for recvmsg style of cases, or all the ERR_PTR type situations). > I like the former a little better because we won't have to special case > NULL for the "after async decryption" call sites. We could also pass &rx_more every time and not check for NULL. What do you want to clean up more specifically? The number of arguments, the backwards goto, the NULL check before setting *more, something else/all of the above?
On Tue, 20 Feb 2024 00:10:58 +0100 Sabrina Dubroca wrote: > 2024-02-19, 12:07:03 -0800, Jakub Kicinski wrote: > > On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote: > > > @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > > > u8 *control, > > > size_t skip, > > > size_t len, > > > - bool is_peek) > > > + bool is_peek, > > > + bool *more) > > > { > > > struct sk_buff *skb = skb_peek(&ctx->rx_list); > > > struct tls_msg *tlm; > > > > > @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > > > > > > out: > > > return copied ? : err; > > > +more: > > > + if (more) > > > + *more = true; > > > + goto out; > > > > Patches look correct, one small nit here - > > > > I don't have great ideas how to avoid the 7th argument completely but > > I hesitated between this patch and a variant combining is_peek and > more into a single u8 *flags, but that felt a bit messy (or does that > fall into what you describe as "not [having] great ideas"? :)) I guess it saves a register, it seems a bit better but then it's a truly in/out argument :) > > I think it'd be a little cleaner if we either: > > - passed in err as an output argument (some datagram code does that > > IIRC), then function can always return copied directly, or > > (yes, __skb_wait_for_more_packets, __skb_try_recv_datagram, and their > variants) > > > - passed copied as an output argument, and then we can always return > > err? > > Aren't those 2 options adding an 8th argument? No, no, still 7, if we separate copied from err - checking err < 0 is enough to know that we need to exit. Differently put, perhaps, my preference is to pass an existing entity (err or copied), rather that conjure new concept (more) on one end and interpret it on the other. > I tend to find ">= 0 on success, otherwise errno" more readable, > probably because that's a very common pattern (either for recvmsg > style of cases, or all the ERR_PTR type situations). Right it definitely is a good pattern. I think passing copied via argument would give us those semantics still? > > I like the former a little better because we won't have to special case > > NULL for the "after async decryption" call sites. > > We could also pass &rx_more every time and not check for NULL. > > What do you want to clean up more specifically? The number of > arguments, the backwards goto, the NULL check before setting *more, > something else/all of the above? Not compiled, but what I had in mind was something along the lines of: diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 9fbc70200cd0..6e6e6d89b173 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, u8 *control, size_t skip, size_t len, - bool is_peek) + bool is_peek, + int *out_copied) { struct sk_buff *skb = skb_peek(&ctx->rx_list); struct tls_msg *tlm; @@ -1843,7 +1844,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, err = 0; out: - return copied ? : err; + *out_copied = copied; + return err; } static bool @@ -1966,11 +1968,10 @@ int tls_sw_recvmsg(struct sock *sk, goto end; /* Process pending decrypted records. It must be non-zero-copy */ - err = process_rx_list(ctx, msg, &control, 0, len, is_peek); + err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &copied); if (err < 0) goto end; - copied = err; if (len <= copied) goto end; @@ -2128,10 +2129,10 @@ int tls_sw_recvmsg(struct sock *sk, /* Drain records from the rx_list & copy if required */ if (is_peek || is_kvec) err = process_rx_list(ctx, msg, &control, copied, - decrypted, is_peek); + decrypted, is_peek, &ret); else err = process_rx_list(ctx, msg, &control, 0, - async_copy_bytes, is_peek); + async_copy_bytes, is_peek, &ret); } copied += decrypted;
2024-02-20, 17:50:53 -0800, Jakub Kicinski wrote: > On Tue, 20 Feb 2024 00:10:58 +0100 Sabrina Dubroca wrote: > > 2024-02-19, 12:07:03 -0800, Jakub Kicinski wrote: > > > On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote: > > > > @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > > > > u8 *control, > > > > size_t skip, > > > > size_t len, > > > > - bool is_peek) > > > > + bool is_peek, > > > > + bool *more) > > > > { > > > > struct sk_buff *skb = skb_peek(&ctx->rx_list); > > > > struct tls_msg *tlm; > > > > > > > @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, > > > > > > > > out: > > > > return copied ? : err; > > > > +more: > > > > + if (more) > > > > + *more = true; > > > > + goto out; > > > > > > Patches look correct, one small nit here - > > > > > > I don't have great ideas how to avoid the 7th argument completely but > > > > I hesitated between this patch and a variant combining is_peek and > > more into a single u8 *flags, but that felt a bit messy (or does that > > fall into what you describe as "not [having] great ideas"? :)) > > I guess it saves a register, it seems a bit better but then it's a > truly in/out argument :) We already do that with darg all over the receive code, so it shouldn't be too confusing to readers. It can be named flags_inout if you think that would help, or have a comment like above tls_decrypt_sg. > > > I think it'd be a little cleaner if we either: > > > - passed in err as an output argument (some datagram code does that > > > IIRC), then function can always return copied directly, or > > > > (yes, __skb_wait_for_more_packets, __skb_try_recv_datagram, and their > > variants) > > > > > - passed copied as an output argument, and then we can always return > > > err? > > > > Aren't those 2 options adding an 8th argument? > > No, no, still 7, if we separate copied from err - checking err < 0 > is enough to know that we need to exit. Right, I realized that you probably meant something like that as I was going to bed last night. It's not exactly enough, since tls_record_content_type will return 0 on a content type mismatch. We'll have to translate that into an "error". I think it would be a bit nicer to set err=1 and then check err != 0 in tls_sw_recvmsg (we can document that in a comment above process_rx_list) rather than making up a fake errno. See diff [1]. Or we could swap the 0/1 returns from tls_record_content_type and switch the err <= 0 tests to err != 0 after the existing calls, then process_rx_list doesn't have a weird special case [2]. What do you think? > Differently put, perhaps, my preference is to pass an existing entity > (err or copied), rather that conjure new concept (more) on one end and > interpret it on the other. > > > I tend to find ">= 0 on success, otherwise errno" more readable, > > probably because that's a very common pattern (either for recvmsg > > style of cases, or all the ERR_PTR type situations). > > Right it definitely is a good pattern. I think passing copied via > argument would give us those semantics still? For recvmsg sure, but not for process_rx_list. > > > I like the former a little better because we won't have to special case > > > NULL for the "after async decryption" call sites. > > > > We could also pass &rx_more every time and not check for NULL. > > > > What do you want to clean up more specifically? The number of > > arguments, the backwards goto, the NULL check before setting *more, > > something else/all of the above? > > Not compiled, but what I had in mind was something along the lines of: copied is a ssize_t (but ret isn't), so the change gets a bit uglier :( ------------ 8< ------------ [1] fix by setting err=1 in process_rx_list diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 43dd0d82b6ed..711504614da7 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1766,13 +1766,19 @@ static void tls_rx_rec_done(struct tls_sw_context_rx *ctx) * decrypted records into the buffer provided by caller zero copy is not * true. Further, the records are removed from the rx_list if it is not a peek * case and the record has been consumed completely. + * + * Return: + * - 0 if len bytes were copied + * - 1 if < len bytes were copied due to a record type mismatch + * - <0 if an error occurred */ static int process_rx_list(struct tls_sw_context_rx *ctx, struct msghdr *msg, u8 *control, size_t skip, size_t len, - bool is_peek) + bool is_peek, + ssize_t *out_copied) { struct sk_buff *skb = skb_peek(&ctx->rx_list); struct tls_msg *tlm; @@ -1802,8 +1808,11 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, tlm = tls_msg(skb); err = tls_record_content_type(msg, tlm, control); - if (err <= 0) + if (err <= 0) { + if (err == 0) + err = 1; goto out; + } err = skb_copy_datagram_msg(skb, rxm->offset + skip, msg, chunk); @@ -1843,7 +1852,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, err = 0; out: - return copied ? : err; + *out_copied = copied; + return err; } static bool @@ -1966,11 +1976,10 @@ int tls_sw_recvmsg(struct sock *sk, goto end; /* Process pending decrypted records. It must be non-zero-copy */ - err = process_rx_list(ctx, msg, &control, 0, len, is_peek); - if (err < 0) + err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &copied); + if (err != 0) goto end; - copied = err; if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA)) goto end; @@ -2114,6 +2123,7 @@ int tls_sw_recvmsg(struct sock *sk, recv_end: if (async) { + ssize_t ret2; int ret; /* Wait for all previously submitted records to be decrypted */ @@ -2130,10 +2140,10 @@ int tls_sw_recvmsg(struct sock *sk, /* Drain records from the rx_list & copy if required */ if (is_peek || is_kvec) err = process_rx_list(ctx, msg, &control, copied, - decrypted, is_peek); + decrypted, is_peek, &ret2); else err = process_rx_list(ctx, msg, &control, 0, - async_copy_bytes, is_peek); + async_copy_bytes, is_peek, &ret2); } copied += decrypted; ------------ 8< ------------ [2] fixing the bug by changing tls_record_content_type as well diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 43dd0d82b6ed..3da62ba97945 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1734,6 +1734,11 @@ int decrypt_skb(struct sock *sk, struct scatterlist *sgout) return tls_decrypt_sg(sk, NULL, sgout, &darg); } +/* Return: + * - 0 on success + * - 1 if the record's type doesn't match the value in control + * - <0 if an error occurred + */ static int tls_record_content_type(struct msghdr *msg, struct tls_msg *tlm, u8 *control) { @@ -1751,10 +1756,10 @@ static int tls_record_content_type(struct msghdr *msg, struct tls_msg *tlm, return -EIO; } } else if (*control != tlm->control) { - return 0; + return 1; } - return 1; + return 0; } static void tls_rx_rec_done(struct tls_sw_context_rx *ctx) @@ -1766,13 +1771,19 @@ static void tls_rx_rec_done(struct tls_sw_context_rx *ctx) * decrypted records into the buffer provided by caller zero copy is not * true. Further, the records are removed from the rx_list if it is not a peek * case and the record has been consumed completely. + * + * Return: + * - 0 if len bytes were copied + * - 1 if < len bytes were copied due to a record type mismatch + * - <0 if an error occurred */ static int process_rx_list(struct tls_sw_context_rx *ctx, struct msghdr *msg, u8 *control, size_t skip, size_t len, - bool is_peek) + bool is_peek, + ssize_t *out_copied) { struct sk_buff *skb = skb_peek(&ctx->rx_list); struct tls_msg *tlm; @@ -1784,7 +1795,7 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, tlm = tls_msg(skb); err = tls_record_content_type(msg, tlm, control); - if (err <= 0) + if (err != 0) goto out; if (skip < rxm->full_len) @@ -1802,7 +1813,7 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, tlm = tls_msg(skb); err = tls_record_content_type(msg, tlm, control); - if (err <= 0) + if (err != 0) goto out; err = skb_copy_datagram_msg(skb, rxm->offset + skip, @@ -1843,7 +1854,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, err = 0; out: - return copied ? : err; + *out_copied = copied; + return err; } static bool @@ -1966,11 +1978,10 @@ int tls_sw_recvmsg(struct sock *sk, goto end; /* Process pending decrypted records. It must be non-zero-copy */ - err = process_rx_list(ctx, msg, &control, 0, len, is_peek); - if (err < 0) + err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &copied); + if (err != 0) goto end; - copied = err; if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA)) goto end; @@ -2032,7 +2043,7 @@ int tls_sw_recvmsg(struct sock *sk, * For tls1.3, we disable async. */ err = tls_record_content_type(msg, tls_msg(darg.skb), &control); - if (err <= 0) { + if (err != 0) { DEBUG_NET_WARN_ON_ONCE(darg.zc); tls_rx_rec_done(ctx); put_on_rx_list_err: @@ -2114,6 +2125,7 @@ int tls_sw_recvmsg(struct sock *sk, recv_end: if (async) { + ssize_t ret2; int ret; /* Wait for all previously submitted records to be decrypted */ @@ -2130,10 +2142,10 @@ int tls_sw_recvmsg(struct sock *sk, /* Drain records from the rx_list & copy if required */ if (is_peek || is_kvec) err = process_rx_list(ctx, msg, &control, copied, - decrypted, is_peek); + decrypted, is_peek, &ret2); else err = process_rx_list(ctx, msg, &control, 0, - async_copy_bytes, is_peek); + async_copy_bytes, is_peek, &ret2); } copied += decrypted;
On Wed, 21 Feb 2024 14:59:40 +0100 Sabrina Dubroca wrote: > It's not exactly enough, since tls_record_content_type will return 0 > on a content type mismatch. We'll have to translate that into an > "error". Ugh, that's unpleasant. > I think it would be a bit nicer to set err=1 and then check > err != 0 in tls_sw_recvmsg (we can document that in a comment above > process_rx_list) rather than making up a fake errno. See diff [1]. > > Or we could swap the 0/1 returns from tls_record_content_type and > switch the err <= 0 tests to err != 0 after the existing calls, then > process_rx_list doesn't have a weird special case [2]. > > What do you think? I missed the error = 1 case, sorry. No strong preference, then. Checking for error = 1 will be as special as the new rx_more flag. Should I apply this version as is, then?
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 43dd0d82b6ed..de96959336c4 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, u8 *control, size_t skip, size_t len, - bool is_peek) + bool is_peek, + bool *more) { struct sk_buff *skb = skb_peek(&ctx->rx_list); struct tls_msg *tlm; @@ -1785,7 +1786,7 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, err = tls_record_content_type(msg, tlm, control); if (err <= 0) - goto out; + goto more; if (skip < rxm->full_len) break; @@ -1803,12 +1804,12 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, err = tls_record_content_type(msg, tlm, control); if (err <= 0) - goto out; + goto more; err = skb_copy_datagram_msg(skb, rxm->offset + skip, msg, chunk); if (err < 0) - goto out; + goto more; len = len - chunk; copied = copied + chunk; @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx, out: return copied ? : err; +more: + if (more) + *more = true; + goto out; } static bool @@ -1947,6 +1952,7 @@ int tls_sw_recvmsg(struct sock *sk, int target, err; bool is_kvec = iov_iter_is_kvec(&msg->msg_iter); bool is_peek = flags & MSG_PEEK; + bool rx_more = false; bool released = true; bool bpf_strp_enabled; bool zc_capable; @@ -1966,12 +1972,12 @@ int tls_sw_recvmsg(struct sock *sk, goto end; /* Process pending decrypted records. It must be non-zero-copy */ - err = process_rx_list(ctx, msg, &control, 0, len, is_peek); + err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &rx_more); if (err < 0) goto end; copied = err; - if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA)) + if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA) || rx_more) goto end; target = sock_rcvlowat(sk, flags & MSG_WAITALL, len); @@ -2130,10 +2136,10 @@ int tls_sw_recvmsg(struct sock *sk, /* Drain records from the rx_list & copy if required */ if (is_peek || is_kvec) err = process_rx_list(ctx, msg, &control, copied, - decrypted, is_peek); + decrypted, is_peek, NULL); else err = process_rx_list(ctx, msg, &control, 0, - async_copy_bytes, is_peek); + async_copy_bytes, is_peek, NULL); } copied += decrypted;
If we queue 3 records: - record 1, type DATA - record 2, some other type - record 3, type DATA and do a recv(PEEK), the rx_list will contain the first two records. The next large recv will walk through the rx_list and copy data from record 1, then stop because record 2 is a different type. Since we haven't filled up our buffer, we will process the next available record. It's also DATA, so we can merge it with the current read. We shouldn't do that, since there was a record in between that we ignored. Add a flag to let process_rx_list inform tls_sw_recvmsg that it had more data available. Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> --- net/tls/tls_sw.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)