Message ID | 5f7378fda41cb0f60ce1a0c59fb3717d04556d45.1519428075.git.crobinso@redhat.com |
---|---|
State | New |
Headers | show |
Series | util: xml: Don't conflict with other libxml2 user callbacks | expand |
On Fri, Feb 23, 2018 at 06:21:15PM -0500, Cole Robinson wrote: >lxml is a popular python XML processing library. It uses libxml2 >behind the scenes, and registers custom callbacks via >xmlSetExternalEntityLoader. However this can cause crashes if >if an app uses both lxml and libxml2 together in the same process. > >This is a known limitation of lxml and libxml2 generally. It also >prevents us from using lxml in virt-manager: > >https://bugzilla.redhat.com/show_bug.cgi?id=1544019 > >However it's easy enough to work around in libvirt, by unsetting the >EntityLoader callback to a known state before we ask libxml2 to >parse a file from disk. > >Signed-off-by: Cole Robinson <crobinso@redhat.com> >--- > src/util/virxml.c | 5 +++++ > 1 file changed, 5 insertions(+) > >diff --git a/src/util/virxml.c b/src/util/virxml.c >index 6e87605ea..3e01794f9 100644 >--- a/src/util/virxml.c >+++ b/src/util/virxml.c >@@ -810,9 +810,14 @@ virXMLParseHelper(int domcode, > pctxt->sax->error = catchXMLError; > > if (filename) { >+ /* Reset any libxml2 file callbacks, other libs (like python lxml) >+ * may have set their own which can get crashy */ >+ xmlExternalEntityLoader origloader = xmlGetExternalEntityLoader(); >+ xmlSetExternalEntityLoader(xmlNoNetExternalEntityLoader); > xml = xmlCtxtReadFile(pctxt, filename, NULL, > XML_PARSE_NONET | > XML_PARSE_NOWARNING); >+ xmlSetExternalEntityLoader(origloader); This does not look thread-safe at all - what if two libvirt threads try to parse some XML at the same time? Jan > } else { > xml = xmlCtxtReadDoc(pctxt, BAD_CAST xmlStr, url, NULL, > XML_PARSE_NONET | >-- >2.14.3 > >-- >libvir-list mailing list >libvir-list@redhat.com >https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Mon, Feb 26, 2018 at 09:48:39AM +0100, Ján Tomko wrote: > On Fri, Feb 23, 2018 at 06:21:15PM -0500, Cole Robinson wrote: > > lxml is a popular python XML processing library. It uses libxml2 > > behind the scenes, and registers custom callbacks via > > xmlSetExternalEntityLoader. However this can cause crashes if > > if an app uses both lxml and libxml2 together in the same process. > > > > This is a known limitation of lxml and libxml2 generally. It also > > prevents us from using lxml in virt-manager: > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1544019 > > > > However it's easy enough to work around in libvirt, by unsetting the > > EntityLoader callback to a known state before we ask libxml2 to > > parse a file from disk. > > > > Signed-off-by: Cole Robinson <crobinso@redhat.com> > > --- > > src/util/virxml.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/src/util/virxml.c b/src/util/virxml.c > > index 6e87605ea..3e01794f9 100644 > > --- a/src/util/virxml.c > > +++ b/src/util/virxml.c > > @@ -810,9 +810,14 @@ virXMLParseHelper(int domcode, > > pctxt->sax->error = catchXMLError; > > > > if (filename) { > > + /* Reset any libxml2 file callbacks, other libs (like python lxml) > > + * may have set their own which can get crashy */ > > + xmlExternalEntityLoader origloader = xmlGetExternalEntityLoader(); > > + xmlSetExternalEntityLoader(xmlNoNetExternalEntityLoader); > > xml = xmlCtxtReadFile(pctxt, filename, NULL, > > XML_PARSE_NONET | > > XML_PARSE_NOWARNING); > > + xmlSetExternalEntityLoader(origloader); > > This does not look thread-safe at all - what if two libvirt threads > try to parse some XML at the same time? Indeed, that is not thread safe - I checked the libxml code and it just sets a static variable, not thread local. Regards, Daniel
diff --git a/src/util/virxml.c b/src/util/virxml.c index 6e87605ea..3e01794f9 100644 --- a/src/util/virxml.c +++ b/src/util/virxml.c @@ -810,9 +810,14 @@ virXMLParseHelper(int domcode, pctxt->sax->error = catchXMLError; if (filename) { + /* Reset any libxml2 file callbacks, other libs (like python lxml) + * may have set their own which can get crashy */ + xmlExternalEntityLoader origloader = xmlGetExternalEntityLoader(); + xmlSetExternalEntityLoader(xmlNoNetExternalEntityLoader); xml = xmlCtxtReadFile(pctxt, filename, NULL, XML_PARSE_NONET | XML_PARSE_NOWARNING); + xmlSetExternalEntityLoader(origloader); } else { xml = xmlCtxtReadDoc(pctxt, BAD_CAST xmlStr, url, NULL, XML_PARSE_NONET |
lxml is a popular python XML processing library. It uses libxml2 behind the scenes, and registers custom callbacks via xmlSetExternalEntityLoader. However this can cause crashes if if an app uses both lxml and libxml2 together in the same process. This is a known limitation of lxml and libxml2 generally. It also prevents us from using lxml in virt-manager: https://bugzilla.redhat.com/show_bug.cgi?id=1544019 However it's easy enough to work around in libvirt, by unsetting the EntityLoader callback to a known state before we ask libxml2 to parse a file from disk. Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/util/virxml.c | 5 +++++ 1 file changed, 5 insertions(+) -- 2.14.3 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list