Query XML with Namespaces in Java Using XPath
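When working with XML documents containing elements bound to namespaces, querying with XPath can become challenging. Consider a workbook document whose workbook, sheets, and sheet elements are all bound to the namespace "http://schemas.openxmlformats.org/spreadsheetml/2006/main".
An XPath expression without namespace prefixes, such as "/workbook/sheets/sheet[1]", will not match anything. In XPath 1.0 an unprefixed element name selects only elements that are in no namespace, so the expression returns an empty result rather than an error, even when the document declares its namespace as the default.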
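The failure can be reproduced with a minimal sketch like the one below. The sample XML string and the class name UnprefixedXPathDemo are assumptions made for illustration, not the original example document. Note that the parser is explicitly made namespace-aware; DocumentBuilderFactory is not namespace-aware by default, and without that setting the namespace bindings are not recorded in the DOM, which can hide the problem.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UnprefixedXPathDemo {
    // Hypothetical sample document: the default namespace declaration binds
    // workbook, sheets, and sheet to the SpreadsheetML namespace.
    static final String XML =
        "<workbook xmlns='http://schemas.openxmlformats.org/spreadsheetml/2006/main'>"
      + "<sheets><sheet name='Sheet1'/><sheet name='Sheet2'/></sheets>"
      + "</workbook>";

    // Parses the XML string with a namespace-aware DOM parser.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns how many nodes the expression selects in the sample document.
    static int countMatches(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(
                expression, parse(XML), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names only match elements in no namespace: prints 0
        System.out.println(countMatches("/workbook/sheets/sheet[1]"));
        // Wildcards ignore the namespace, so the elements are found: prints 2
        System.out.println(countMatches("/*/*/*"));
    }
}
```

The second query shows that the elements are present; only the name test in the first query excludes them.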
Solution 1: Register Namespace and Use Namespace Prefix
The recommended approach is to register the namespace with a namespace prefix, making XPath easier to read and maintain:
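    import java.util.Iterator;
    import javax.xml.namespace.NamespaceContext;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.NodeList;

    // Maps the prefixes used in the XPath expression to namespace URIs.
    NamespaceContext namespaceContext = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if ("main".equals(prefix)) {
                return "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
            } else if ("r".equals(prefix)) {
                return "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
            }
            return null; // unknown prefix
        }
        // The interface declares these two methods as well, so they must be
        // implemented for the class to compile; this snippet never calls them.
        @Override
        public String getPrefix(String namespaceURI) {
            throw new UnsupportedOperationException();
        }
        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            throw new UnsupportedOperationException();
        }
    };
    // "document" must come from a namespace-aware parser
    // (DocumentBuilderFactory.setNamespaceAware(true)).
    XPathFactory xPathFactory = XPathFactory.newInstance();
    XPath xPath = xPathFactory.newXPath();
    xPath.setNamespaceContext(namespaceContext);
    NodeList nodes = (NodeList) xPath.evaluate("/main:workbook/main:sheets/main:sheet[1]", document, XPathConstants.NODESET);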
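With the prefix registered, the XPath expression becomes "/main:workbook/main:sheets/main:sheet[1]", which correctly addresses elements bound to the specified namespace. The prefix is only a local alias: it does not need to match the prefix (or default namespace declaration) used in the XML document, because elements are matched by namespace URI.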
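As a self-contained sketch of this approach, the example below backs the NamespaceContext with a Map so that more prefixes can be registered without adding branches. The helper contextOf, the class name PrefixedXPathDemo, and the sample XML are assumptions made for illustration. The sample document uses a default namespace declaration while the XPath uses the "main" prefix, which shows that only the URI has to agree.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixedXPathDemo {
    static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hypothetical sample document using a default namespace (no prefix).
    static final String XML =
        "<workbook xmlns='" + MAIN_NS + "'>"
      + "<sheets><sheet name='Sheet1'/><sheet name='Sheet2'/></sheets>"
      + "</workbook>";

    // Hypothetical helper: builds a NamespaceContext from prefix -> URI bindings.
    static NamespaceContext contextOf(Map<String, String> bindings) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                // Returns null for a prefix that is not registered.
                return bindings.get(prefix);
            }

            @Override
            public String getPrefix(String namespaceURI) {
                for (Map.Entry<String, String> entry : bindings.entrySet()) {
                    if (entry.getValue().equals(namespaceURI)) {
                        return entry.getKey();
                    }
                }
                return null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                String prefix = getPrefix(namespaceURI);
                return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
            }
        };
    }

    // Evaluates the prefixed expression and returns the first sheet's name attribute.
    static String firstSheetName() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace matching
            Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(contextOf(Collections.singletonMap("main", MAIN_NS)));
            return xPath.evaluate(
                "/main:workbook/main:sheets/main:sheet[1]/@name", document);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstSheetName()); // prints Sheet1
    }
}
```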
Solution 2: Generic Match and Predicate Filter
Alternatively, an XPath expression without a namespace prefix can be constructed by using a generic match for the element and a predicate filter that specifies the desired local-name() and namespace-uri():
    XPathFactory xPathFactory = XPathFactory.newInstance();
    XPath xPath = xPathFactory.newXPath();
    // No NamespaceContext is needed: every step tests the local name
    // and the namespace URI explicitly.
    NodeList nodes = (NodeList) xPath.evaluate(
        "/*[local-name()='workbook' and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']"
            + "/*[local-name()='sheets' and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main']"
            + "/*[local-name()='sheet' and namespace-uri()='http://schemas.openxmlformats.org/spreadsheetml/2006/main'][1]",
        document, XPathConstants.NODESET);
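This method is verbose. It is tempting to shorten it by dropping the namespace-uri() test and matching on local-name() alone, but that risks selecting the wrong elements if mixed vocabularies are present in the XML document, that is, if elements from different namespaces share the same local name.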
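The mixed-vocabulary risk can be illustrated with a small sketch. The document below is invented for the purpose: "urn:example:other" is a made-up namespace, and the helpers step and looseStep as well as the class name GenericMatchDemo are hypothetical. The sketch compares a path that tests local-name() only with one that also tests namespace-uri().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class GenericMatchDemo {
    static final String MAIN_NS =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hypothetical mixed-vocabulary document: two elements share the local
    // name "sheet" but belong to different namespaces.
    static final String XML =
        "<workbook xmlns='" + MAIN_NS + "' xmlns:x='urn:example:other'>"
      + "<sheets><x:sheet name='Other'/><sheet name='Sheet1'/></sheets>"
      + "</workbook>";

    // Builds one location step that checks both local name and namespace URI.
    static String step(String localName, String namespaceUri) {
        return "/*[local-name()='" + localName
            + "' and namespace-uri()='" + namespaceUri + "']";
    }

    // Builds one location step that checks the local name only.
    static String looseStep(String localName) {
        return "/*[local-name()='" + localName + "']";
    }

    // Returns how many nodes the expression selects in the sample document.
    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-uri() to be populated
            Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(
                expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String loose = looseStep("workbook") + looseStep("sheets") + looseStep("sheet");
        String strict = step("workbook", MAIN_NS) + step("sheets", MAIN_NS)
            + step("sheet", MAIN_NS);
        System.out.println(count(loose));  // prints 2: also matches x:sheet
        System.out.println(count(strict)); // prints 1: only the SpreadsheetML sheet
    }
}
```

Building the steps through a helper keeps the namespace URI in one place, which offsets some of the verbosity of the predicate form.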
Conclusion
When dealing with XML documents with namespaces, it is essential to consider namespace bindings. By registering the namespace with a prefix or carefully crafting generic XPath expressions, accurate and reliable queries can be performed.