Java: Parsing an XML file using SAX/XPath

Dax Amin

I have an XML file, shown below:

<?xml version="1.0" encoding="UTF-8"?>
<Workbook>
    <ExcelWorkbook
    xmlns="urn:schemas-microsoft-com:office:excel"/>
        <Worksheet ss:Name="Table 1">
            <Table>
                <Row ss:Index="7" ss:AutoFitHeight="0" ss:Height="12">
                <Cell ss:Index="1" ss:StyleID="s05">
                    <ss:Data ss:Type="String"
                        xmlns="http://www.w3.org/TR/REC-html40">
                        <Font html:Size="9" html:Face="Times New Roman" x:Family="Roman" html:Color="#000000">
                        ABCD
                        </Font>
                    </ss:Data>
                </Cell>
            </Row>

How do I extract the data ("ABCD" here) using SAX or XPath in Java?

EDIT 1:

This is the XML:

<Table>
<Row ss:Index="74" ss:AutoFitHeight="0" ss:Height="14">
    <Cell ss:Index="1" ss:MergeAcross="3" ss:StyleID="s29">
        <ss:Data ss:Type="Number" xmlns="http://www.w3.org/TR/REC-html40">
        0.00
        </ss:Data>
    </Cell>
    <Cell ss:Index="15" ss:MergeAcross="5" ss:StyleID="s29">
        <ss:Data ss:Type="Number" xmlns="http://www.w3.org/TR/REC-html40">
        4.57
        </ss:Data>
    </Cell>
</Row>
Sharon Ben Asher

The solution assumes that the question is how to get the text for any cell based on row and column numbers.

It took me a while to arrive at the solution because the input document uses namespaces. Apparently, XPath cannot resolve prefixed element and attribute names without a NamespaceContext, and one has to implement that interface oneself (there is no default?), so I found a map-based implementation here and used it.
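For readers who do not have the linked class at hand, a minimal map-based NamespaceContext might look like the sketch below. The class name SimpleNamespaceContext and its varargs constructor are assumptions made for illustration; this is not the linked implementation, and it skips the reserved xml and xmlns prefixes that a complete implementation is expected to handle.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical stand-in for a map-based NamespaceContext; not the linked class.
final class SimpleNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri = new HashMap<>();

    // Accepts alternating prefix/URI arguments: prefix1, uri1, prefix2, uri2, ...
    SimpleNamespaceContext(String... prefixUriPairs) {
        if (prefixUriPairs.length % 2 != 0) {
            throw new IllegalArgumentException("Expected prefix/URI pairs");
        }
        for (int i = 0; i < prefixUriPairs.length; i += 2) {
            prefixToUri.put(prefixUriPairs[i], prefixUriPairs[i + 1]);
        }
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        // Unbound prefixes resolve to the empty string.
        return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        // Collect every prefix bound to the given URI.
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }
}
```

Only getNamespaceURI is needed to resolve prefixes such as ss in an XPath expression; the two reverse lookups are there to satisfy the interface.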

So, assuming you have the class from the link in your source tree, the following code works. I broke the search pattern into several variables for the sake of clarity:

import java.io.FileReader;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public static String getCellValue(String filename, int rowIdx, int colIdx) {
    // match a Table element anywhere in the document
    String tableElementPattern = "//*[name()='Table']";
    // match the Row child with the given ss:Index
    String rowPattern = String.format("/*[name()='Row' and @ss:Index='%d']", rowIdx);
    // match the Cell child with the given ss:Index (column number)
    String cellPattern = String.format("/*[name()='Cell' and @ss:Index='%d']", colIdx);
    // match the child with ss:Type="String", then its child element that contains text, and select that text node
    String cellStringContent = "/*[@ss:Type='String']/*[text()]/text()";
    String completePattern = tableElementPattern + rowPattern + cellPattern + cellStringContent;

    try (FileReader reader = new FileReader(filename)) {
        XPath xPath = getXpathProcessor();
        Node n = (Node) xPath.compile(completePattern)
                .evaluate(new InputSource(reader), XPathConstants.NODE);
        // evaluate() returns null when nothing matches, so guard before dereferencing
        if (n != null && n.getNodeType() == Node.TEXT_NODE) {
            return n.getNodeValue().trim();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return null;
}

private static XPath getXpathProcessor() {
    // this is where the custom implementation of NamespaceContext is used
    NamespaceContext context = new NamespaceContextMap(
        "html", "http://www.w3.org/TR/REC-html40", 
        "xsl", "http://www.w3.org/1999/XSL/Transform",
        "o", "urn:schemas-microsoft-com:office:office",
        "x", "urn:schemas-microsoft-com:office:excel",
        "ss", "urn:schemas-microsoft-com:office:spreadsheet");
    XPath xpath =  XPathFactory.newInstance().newXPath();
    xpath.setNamespaceContext(context);
    return xpath;
}

Calling:

System.out.println(getCellValue("C://Temp/xx.xml", 7, 1));

produces the desired output ("ABCD" for the sample in the question).
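To try the expression without a file on disk, the following self-contained sketch evaluates it against an inline document. The inline XML is an assumption: it is a trimmed, well-formed variant of the samples in the question with the ss and html prefixes declared on Workbook, which the posted fragments do not show. The class CellLookupDemo and both method names are hypothetical. The second method, dataText, is not part of the answer above; it is a variant that reads the whole string value of ss:Data, which may suit the Number cells from EDIT 1 where the text sits directly under ss:Data.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class CellLookupDemo {
    static final String SS = "urn:schemas-microsoft-com:office:spreadsheet";

    // Assumed well-formed variant of the question's samples (prefix declarations added).
    static final String SAMPLE =
        "<Workbook xmlns=\"" + SS + "\" xmlns:ss=\"" + SS + "\""
        + " xmlns:html=\"http://www.w3.org/TR/REC-html40\">"
        + "<Worksheet ss:Name=\"Table 1\"><Table>"
        + "<Row ss:Index=\"7\"><Cell ss:Index=\"1\">"
        + "<ss:Data ss:Type=\"String\" xmlns=\"http://www.w3.org/TR/REC-html40\">"
        + "<Font html:Size=\"9\"> ABCD </Font></ss:Data></Cell></Row>"
        + "<Row ss:Index=\"74\"><Cell ss:Index=\"15\">"
        + "<ss:Data ss:Type=\"Number\" xmlns=\"http://www.w3.org/TR/REC-html40\">"
        + " 4.57 </ss:Data></Cell></Row>"
        + "</Table></Worksheet></Workbook>";

    // Path to the Cell element, shared by both lookups.
    private static final String CELL =
        "//*[name()='Table']/*[name()='Row' and @ss:Index='%d']"
        + "/*[name()='Cell' and @ss:Index='%d']";

    // XPath instance whose namespace context knows only the "ss" prefix.
    static XPath newXPath() {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "ss".equals(prefix) ? SS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return SS.equals(uri) ? "ss" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return SS.equals(uri)
                    ? Collections.singletonList("ss").iterator()
                    : Collections.<String>emptyIterator();
            }
        });
        return xpath;
    }

    // Same pattern as the answer: text of the element nested under a String-typed child.
    static String fontText(String xml, int row, int col) {
        String expr = String.format(CELL + "/*[@ss:Type='String']/*[text()]/text()", row, col);
        try {
            Node n = (Node) newXPath().evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODE);
            return n == null ? null : n.getNodeValue().trim();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Variant (not from the answer): whitespace-normalized string value of ss:Data.
    static String dataText(String xml, int row, int col) {
        String expr = String.format("normalize-space(" + CELL + "/ss:Data)", row, col);
        try {
            return (String) newXPath().evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fontText(SAMPLE, 7, 1));   // ABCD
        System.out.println(dataText(SAMPLE, 74, 15)); // 4.57
    }
}
```

Note that fontText returns null for the Number cell, because its pattern requires ss:Type='String' and a child element holding the text.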
