I want to convert an XML file to CSV using Java code, and I don't want to use an XML stylesheet (XSL) or XSLT. Here is my XML file.

<?xml version="1.0" encoding="UTF-8"?>
<PickAndPlace>
<Components>
    <Component id="1">
        <X_Dimension>4.33</X_Dimension>
        <Y_Dimension>2.962</Y_Dimension>
        <Designation>None</Designation>
        <Package>None</Package>
        <Angle>0</Angle>
    </Component>
    <Component id="5">
        <X_Dimension>4.33</X_Dimension>
        <Y_Dimension>8.692</Y_Dimension>
        <Designation>None</Designation>
        <Package>None</Package>
        <Angle>0</Angle>
    </Component>
    <Component id="9">
        <X_Dimension>4.33</X_Dimension>
        <Y_Dimension>14.381</Y_Dimension>
        <Designation>None</Designation>
        <Package>None</Package>
        <Angle>0</Angle>
    </Component>
</Components>
</PickAndPlace>

Here is what I want as my CSV output.

X_Dimension,Y_Dimension,Designation,Package,Angle,_id
4.33,2.962,None,None,0,1
4.33,8.692,None,None,0,5
4.33,14.381,None,None,0,9
  • I think you need to think about how you want to structure your CSV file from this XML. Commented Aug 22, 2017 at 9:31
  • Can you provide us with the intended CSV output? Commented Aug 22, 2017 at 9:36
  • @khriskooper and @Stefan I have shown my CSV output. Commented Aug 22, 2017 at 9:43
  • Have you thought about using XPath to make reading XML values easier? Are the components in the correct order in the XML file? Commented Aug 22, 2017 at 9:52
  • @khriskooper "Are the components in the correct order in the XML file?" Yes. Commented Aug 22, 2017 at 9:57

1 Answer

You could read the file line-by-line, extracting only the data you need, and storing everything into a temporary LinkedList of Strings:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.LinkedList;

    LinkedList<String> tmpList = new LinkedList<>();
    try (BufferedReader reader = Files.newBufferedReader(Paths.get("c:/tmp.xml"), StandardCharsets.UTF_8)) {
        String line;
        while ((line = reader.readLine()) != null) {
            // The <Component> start tag carries the id; the five value elements
            // are assumed to follow on the next five lines, in this exact order.
            if (line.contains("<Component id=")) {
                String _id = extractValue(line, "<Component id=\"", "\">");
                String _xDimension = extractValue(reader.readLine(), "<X_Dimension>", "</X_Dimension>");
                String _yDimension = extractValue(reader.readLine(), "<Y_Dimension>", "</Y_Dimension>");
                String _designation = extractValue(reader.readLine(), "<Designation>", "</Designation>");
                String _package = extractValue(reader.readLine(), "<Package>", "</Package>");
                String _angle = extractValue(reader.readLine(), "<Angle>", "</Angle>");
                // Column order matches the CSV header: values first, id last.
                tmpList.add(_xDimension + "," + _yDimension + "," + _designation + "," + _package + "," + _angle + "," + _id);
            }
        }
    } catch (IOException e) {
        System.err.println(e);
    }

This handy utility method will deal with extracting values for the above code. Note that it may need to be made more robust depending on your data and requirements, but it works fine for the sample-set you provided:

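    // Strips the given opening and closing markers from a single line.
    // String.replace matches literally, so the markers are never treated as regular expressions.
    private static String extractValue(String line, String prefix, String postfix) {
        String value = line.trim().replace(prefix, "");
        value = value.replace(postfix, "");
        return value;
    }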

Once read, you could write the LinkedList of Strings to a new file:

    // Needs java.io.IOException and java.io.PrintWriter.
    // try-with-resources closes the writer even if writing fails.
    try (PrintWriter writer = new PrintWriter("c:/tmp.csv", "UTF-8")) {
        writer.println("X_Dimension,Y_Dimension,Designation,Package,Angle,_id");
        for (String line : tmpList) {
            writer.println(line);
        }
    } catch (IOException e) {
        System.err.println(e);
    }

Of course, this method relies heavily on the XML data being consistently structured like this throughout.
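If the layout can vary, a rough sketch along the following lines, using the DOM parser that ships with the JDK (javax.xml.parsers), avoids depending on line breaks and indentation. The class name DomXmlToCsv and the helper childText are made up for illustration, the column list is taken from the desired CSV header in the question, and no CSV quoting is attempted, so values containing commas or quotes would need extra handling.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original answer.
final class DomXmlToCsv {
    // Column order assumed from the CSV header shown in the question.
    private static final String[] FIELDS =
            {"X_Dimension", "Y_Dimension", "Designation", "Package", "Angle"};

    // Parses the XML text and returns the CSV text, header row included.
    static String convert(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            StringBuilder csv = new StringBuilder(String.join(",", FIELDS)).append(",_id\n");
            NodeList components = doc.getElementsByTagName("Component");
            for (int i = 0; i < components.getLength(); i++) {
                Element component = (Element) components.item(i);
                for (String field : FIELDS) {
                    csv.append(childText(component, field)).append(',');
                }
                csv.append(component.getAttribute("id")).append('\n');
            }
            return csv.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the trimmed text of the first descendant with this tag, or "" if absent.
    private static String childText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() == 0 ? "" : matches.item(0).getTextContent().trim();
    }
}
```

Calling DomXmlToCsv.convert with the file contents as a string returns the CSV text, which can then be written out with a PrintWriter. Because the parser resolves entities and CDATA sections itself, the same code copes with input that is all on one line or indented differently.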

As a final note, you could remove the need for the temporary list by writing out to a file directly, instead of adding values to a list first. It is nice to separate input and output in code though.
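For what it's worth, the direct-write variant might look roughly like the sketch below. It keeps the same line-by-line assumption (each value element on its own line, in a fixed order right after the Component start tag) and simply writes each row as soon as it has been read. The class name DirectCsvWriter and the TAGS array are invented for this example; taking a Reader and a Writer instead of file paths is only a choice that makes the method easy to try with in-memory strings.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper class, not part of the original answer.
final class DirectCsvWriter {
    // Value elements expected on the lines after each <Component id="..."> line.
    private static final String[] TAGS =
            {"X_Dimension", "Y_Dimension", "Designation", "Package", "Angle"};

    // Reads the XML line by line and writes each CSV row immediately, with no temporary list.
    static void convert(Reader in, Writer out) {
        try {
            BufferedReader reader = new BufferedReader(in);
            out.write("X_Dimension,Y_Dimension,Designation,Package,Angle,_id\n");
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.contains("<Component id=")) {
                    continue;
                }
                String id = extractValue(line, "<Component id=\"", "\">");
                StringBuilder row = new StringBuilder();
                for (String tag : TAGS) {
                    // Assumes the next line exists and holds exactly this element.
                    row.append(extractValue(reader.readLine(), "<" + tag + ">", "</" + tag + ">"))
                       .append(',');
                }
                out.write(row.append(id).append('\n').toString());
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Strips the literal opening and closing markers from one line.
    private static String extractValue(String line, String prefix, String postfix) {
        return line.trim().replace(prefix, "").replace(postfix, "");
    }
}
```

With files, the reader and writer would come from Files.newBufferedReader and Files.newBufferedWriter inside a try-with-resources block, so both are closed when the conversion finishes.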

5 Comments

It is really helpful. Thank you.
I'm glad this has helped you. Please consider accepting my answer in case it helps somebody else. Good luck!
This is horrendous. Just parse it with JDOM 2 or something. This willingness to be as likely as possible to fail is not worth spreading.
@kumesana, I agree it would be better to use a time tested parsing library, if possible. I can't see anything wrong with this approach if the data is consistent enough though.
@khriskooper it is wrong to expect that kind of data consistency in the first place. When you're authoring XML documents, you know that markup doesn't have to respect this indentation, so you wouldn't begin to care about it. Maybe it will be consistent, maybe it won't, and it's nobody's problem. When dealing with XML, you fail if you start to expect these things. This approach also fails to deal with things like character escapes and CDATA sections, which are perfectly acceptable XML, so there is no reason for a parser to reject them and no reason for an author to avoid them.
