Message ID | 20230530141635.136968-9-dhowells@redhat.com |
---|---|
State | New |
Series | crypto, splice, net: Make AF_ALG handle sendmsg(MSG_SPLICE_PAGES) |
On Tue, 2023-05-30 at 15:16 +0100, David Howells wrote:
> Make AF_ALG sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
> spliced from the source iterator.
>
> This allows ->sendpage() to be replaced by something that can handle
> multiple multipage folios in a single transaction.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Herbert Xu <herbert@gondor.apana.org.au>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-crypto@vger.kernel.org
> cc: netdev@vger.kernel.org
> ---
>  crypto/af_alg.c         | 28 ++++++++++++++++++++++++++--
>  crypto/algif_aead.c     | 22 +++++++++++-----------
>  crypto/algif_skcipher.c |  8 ++++----
>  3 files changed, 41 insertions(+), 17 deletions(-)
>
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index fd56ccff6fed..62f4205d42e3 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -940,6 +940,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  	bool init = false;
>  	int err = 0;
>
> +	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> +	    !iov_iter_is_bvec(&msg->msg_iter))
> +		return -EINVAL;
> +
>  	if (msg->msg_controllen) {
>  		err = af_alg_cmsg_send(msg, &con);
>  		if (err)
> @@ -985,7 +989,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  	while (size) {
>  		struct scatterlist *sg;
>  		size_t len = size;
> -		size_t plen;
> +		ssize_t plen;
>
>  		/* use the existing memory in an allocated page */
>  		if (ctx->merge) {
> @@ -1030,7 +1034,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  		if (sgl->cur)
>  			sg_unmark_end(sg + sgl->cur - 1);
>
> -		if (1 /* TODO check MSG_SPLICE_PAGES */) {
> +		if (msg->msg_flags & MSG_SPLICE_PAGES) {
> +			struct sg_table sgtable = {
> +				.sgl = sg,
> +				.nents = sgl->cur,
> +				.orig_nents = sgl->cur,
> +			};
> +
> +			plen = extract_iter_to_sg(&msg->msg_iter, len, &sgtable,
> +						  MAX_SGL_ENTS, 0);

It looks like the above expect/supports only ITER_BVEC iterators, what
about adding a WARN_ON_ONCE(<other iov type>)?

Also, I'm keeping this series a bit more in pw to allow Herbert or
others to have a look.

Cheers,

Paolo
Paolo Abeni <pabeni@redhat.com> wrote:

> > +	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> > +	    !iov_iter_is_bvec(&msg->msg_iter))
> > +		return -EINVAL;
> > +
> ...
> It looks like the above expect/supports only ITER_BVEC iterators, what
> about adding a WARN_ON_ONCE(<other iov type>)?

Meh. I relaxed that requirement as I'm now using tools to extract stuff from
any iterator (extract_iter_to_sg() in this case) rather than walking the
bvec[] directly. I forgot to remove the check from af_alg. I can add an
extra patch to remove it. Also, it probably doesn't matter for AF_ALG since
that's only likely to be called from userspace, either directly (which will
not set MSG_SPLICE_PAGES) or via splice (which will pass a BVEC). Internal
kernel code will use crypto API directly.

> Also, I'm keeping this series a bit more in pw to allow Herbert or
> others to have a look.

Thanks.

David
On Thu, 2023-06-01 at 12:35 +0100, David Howells wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
>
> > > +	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> > > +	    !iov_iter_is_bvec(&msg->msg_iter))
> > > +		return -EINVAL;
> > > +
> > ...
> > It looks like the above expect/supports only ITER_BVEC iterators, what
> > about adding a WARN_ON_ONCE(<other iov type>)?
>
> Meh. I relaxed that requirement as I'm now using tools to extract stuff from
> any iterator (extract_iter_to_sg() in this case) rather than walking the
> bvec[] directly. I forgot to remove the check from af_alg. I can add an
> extra patch to remove it. Also, it probably doesn't matter for AF_ALG since
> that's only likely to be called from userspace, either directly (which will
> not set MSG_SPLICE_PAGES) or via splice (which will pass a BVEC). Internal
> kernel code will use crypto API directly.

Thank you for the clarification, I got lost a bit. The patch LGTM as
is.

>
> > Also, I'm keeping this series a bit more in pw to allow Herbert or
> > others to have a look.

@Herbert, the series LGTM, I think we should apply it. If you have any
concerns, please voice them soon!

Thanks,

Paolo
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index fd56ccff6fed..62f4205d42e3 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -940,6 +940,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	bool init = false;
 	int err = 0;
 
+	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
+	    !iov_iter_is_bvec(&msg->msg_iter))
+		return -EINVAL;
+
 	if (msg->msg_controllen) {
 		err = af_alg_cmsg_send(msg, &con);
 		if (err)
@@ -985,7 +989,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	while (size) {
 		struct scatterlist *sg;
 		size_t len = size;
-		size_t plen;
+		ssize_t plen;
 
 		/* use the existing memory in an allocated page */
 		if (ctx->merge) {
@@ -1030,7 +1034,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		if (sgl->cur)
 			sg_unmark_end(sg + sgl->cur - 1);
 
-		if (1 /* TODO check MSG_SPLICE_PAGES */) {
+		if (msg->msg_flags & MSG_SPLICE_PAGES) {
+			struct sg_table sgtable = {
+				.sgl = sg,
+				.nents = sgl->cur,
+				.orig_nents = sgl->cur,
+			};
+
+			plen = extract_iter_to_sg(&msg->msg_iter, len, &sgtable,
+						  MAX_SGL_ENTS, 0);
+			if (plen < 0) {
+				err = plen;
+				goto unlock;
+			}
+
+			for (; sgl->cur < sgtable.nents; sgl->cur++)
+				get_page(sg_page(&sg[sgl->cur]));
+			len -= plen;
+			ctx->used += plen;
+			copied += plen;
+			size -= plen;
+		} else {
 			do {
 				struct page *pg;
 				unsigned int i = sgl->cur;
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 829878025dba..35bfa283748d 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,8 +9,8 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
+ * filled by user space with the data submitted via sendpage. Filling up
+ * the TX SGL does not cause a crypto operation -- the data will only be
  * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
  * provide a buffer which is tracked with the RX SGL.
  *
@@ -113,19 +113,19 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 	}
 
 	/*
-	 * Data length provided by caller via sendmsg/sendpage that has not
-	 * yet been processed.
+	 * Data length provided by caller via sendmsg that has not yet been
+	 * processed.
 	 */
 	used = ctx->used;
 
 	/*
-	 * Make sure sufficient data is present -- note, the same check is
-	 * also present in sendmsg/sendpage. The checks in sendpage/sendmsg
-	 * shall provide an information to the data sender that something is
-	 * wrong, but they are irrelevant to maintain the kernel integrity.
-	 * We need this check here too in case user space decides to not honor
-	 * the error message in sendmsg/sendpage and still call recvmsg. This
-	 * check here protects the kernel integrity.
+	 * Make sure sufficient data is present -- note, the same check is also
+	 * present in sendmsg. The checks in sendmsg shall provide an
+	 * information to the data sender that something is wrong, but they are
+	 * irrelevant to maintain the kernel integrity. We need this check
+	 * here too in case user space decides to not honor the error message
+	 * in sendmsg and still call recvmsg. This check here protects the
+	 * kernel integrity.
 	 */
 	if (!aead_sufficient_data(sk))
 		return -EINVAL;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index a251cd6bd5b9..b1f321b9f846 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -9,10 +9,10 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg. Filling up the TX
+ * SGL does not cause a crypto operation -- the data will only be tracked by
+ * the kernel. Upon receipt of one recvmsg call, the caller must provide a
+ * buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed
Make AF_ALG sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
spliced from the source iterator.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/af_alg.c         | 28 ++++++++++++++++++++++++++--
 crypto/algif_aead.c     | 22 +++++++++++-----------
 crypto/algif_skcipher.c |  8 ++++----
 3 files changed, 41 insertions(+), 17 deletions(-)
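
The af_alg.c hunk changes `plen` from `size_t` to `ssize_t` and then tests `plen < 0` on the result of extract_iter_to_sg(). The userspace sketch below illustrates why that type change is needed for the error check to work. It is not kernel code: `fake_extract()`, `consume_unsigned()`, `consume_signed()` and `demo()` are hypothetical names made up for illustration, and `fake_extract()` only assumes the "byte count or negative errno" convention that the patch's error check relies on.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for an extraction helper: returns the number of
 * bytes consumed, or a negative errno on failure. */
static ssize_t fake_extract(size_t len, int fail)
{
	return fail ? -EFAULT : (ssize_t)len;
}

/* Old declaration style: the result is stored in an unsigned variable.
 * A failure becomes a huge positive value, so the subtraction wraps and
 * the "remaining" size grows instead of an error being reported. */
static size_t consume_unsigned(size_t size, int fail)
{
	size_t plen = (size_t)fake_extract(size, fail);

	return size - plen;
}

/* New declaration style: the signed variable keeps the error visible,
 * and the accounting only happens on success. */
static int consume_signed(size_t *size, int fail)
{
	ssize_t plen = fake_extract(*size, fail);

	if (plen < 0)
		return (int)plen;
	*size -= (size_t)plen;
	return 0;
}

/* Small usage example exercising both variants. */
static void demo(void)
{
	size_t size = 16;

	/* Signed: the failure is reported and size is left untouched. */
	assert(consume_signed(&size, 1) == -EFAULT);
	assert(size == 16);

	/* Unsigned: the failure silently corrupts the remaining count. */
	assert(consume_unsigned(16, 1) != 0);
	assert(consume_unsigned(16, 0) == 0);
}
```

With an unsigned `plen`, the `len -= plen` style accounting that follows the call in the patch would run on a wrapped value, which is presumably why the declaration was switched in the same hunk.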