1

I have data that was pulled from an Excel file into a JSON structure using Java, Apache POI and Jackson. The resulting JSON data structure looks like the following:

{
  "fileName" : "C:\\Users\\jgagnon\\sample_data\\PDM_BOM.xlsx",
  "sheets" : [ {
    "name" : "PDM_BOM",
    "data" : [ [ "BRANCH", "PARENT ITEM NUMBER", "2ND ITEM NUMBER", "QUANTITY REQUIRED", "UNIT OF MEASURE", "ISSUE TYPE CODE", "LINE TYPE", "STOCK TYPE", "TYPE BOM", "LINE NUMBER", "OPERATING SEQUENCE", "EFFECTIVE FROM DATE", "EFFECTIVE THRU DATE", "DRAWING NUMBER", "UNIT COST", "SCRAP PERCENT" ],
        [ "B20", "208E8840040", "5P884LPFSR2", 0.32031, "LB", "I", "S", "M", "M", 1.0, 10.0, "09/11/13", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "208E8840168", "5P884LPFSR2", 1.36, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "208E8840172", "5P884LPFSR2", 1.3924, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "208E8840180", "5P884LPFSR2", 1.4565, "LB", "I", "S", "P", "M", 1.0, 10.0, "03/04/09", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "21PPH150166", "8P315TPMRG", 1.39629, "LB", "I", "S", "M", "M", 1.0, 10.0, "03/05/18", "12/31/40", null, 0.0, 0.0 ] ],
    "maxCols" : 16,
    "maxRows" : 14996
  } ]
}

In the data element, there is essentially an array of arrays, representing all of the rows from a worksheet. The first row array is the column headers for the data rows that follow.

I'd like to be able to restructure the data so that it's in the form of a map of objects, where each key is (for this example) the value in the PARENT ITEM NUMBER column. The map value associated with the key would be a JSON object consisting of the key/value pairs of column header and column value for the current data row.

So, using the example above, I'd end up with something like this: (I may have the JSON syntax/structuring incorrect)

{
  "data": {
    "208E8840040": {
      "BRANCH": "B20",
      "PARENT ITEM NUMBER": "208E8840040",
      "2ND ITEM NUMBER": "5P884LPFSR2",
      "QUANTITY REQUIRED": 0.32031,
      "UNIT OF MEASURE": "LB",
      "ISSUE TYPE CODE": "I",
      "LINE TYPE": "S",
      "STOCK TYPE": "M",
      "TYPE BOM": "M", 
      "LINE NUMBER": 1.0, 
      "OPERATING SEQUENCE": 10.0, 
      "EFFECTIVE FROM DATE": "09/11/13", 
      "EFFECTIVE THRU DATE": "12/31/40", 
      "DRAWING NUMBER": null, 
      "UNIT COST": 0.0, 
      "SCRAP PERCENT": 0.0
    },
    "208E8840168": {
      "BRANCH": "B20",
      "PARENT ITEM NUMBER": "208E8840168",
      "2ND ITEM NUMBER": "5P884LPFSR2",
      "QUANTITY REQUIRED": 1.36,
      ...
    },
    ...
  }
}

I'm looking for a way to convert the former into the latter.

UPDATE:

I just realized I left out an important detail.

The data in this sheet (table), is essentially keyed by the PARENT ITEM NUMBER column. However, although that column is the primary identifier, it is not always unique in the table.

There are often cases, where there are multiple rows with the same value for PARENT ITEM NUMBER. Each of these rows contains information about elements that make up the "parent" (think of them as child items). These child items are identified by the 2ND ITEM NUMBER column.

Moreover, many child items will have their own row in the table where the PARENT ITEM NUMBER is the child item number identified by 2ND ITEM NUMBER for the parent. These child items, as you may have guessed, can have their own child items, and so on.

Basically, this is a tabular representation of multiple hierarchies of related data. Some child items will appear (are re-used) for multiple parent items.

I don't know in what ways this complicates what I'm trying to do.

UPDATE:

Thanks to https://stackoverflow.com/users/51591/micha%c5%82-ziober for the initial idea. I adapted it slightly to yield a map of lists of child items. The modified code follows:

  public String convertToJson(File jsonFile) throws IOException {
    ArrayNode arrayNode = readDataArray(jsonFile);

    List<Map<String, JsonNode>> rowMaps = convertArrayToMaps(arrayNode);

    Map<Object, List<Map<String, JsonNode>>> dataMap = rowMaps.stream()
        .collect(Collectors.groupingBy(map -> map.get("PARENT ITEM NUMBER").textValue()));
    return jsonMapper.writeValueAsString(Collections.singletonMap("data", dataMap));
  }

Here's an example of the output:

{
  "data" : {
    "MTDMN97PJ1A9" : [ {    <- 1 child
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "MTDMN97PJ1A9",
      "2ND ITEM NUMBER" : "MTDMN970144XO",
      "QUANTITY REQUIRED" : 12.0,
      "UNIT OF MEASURE" : "SY",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "01/18/19",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    } ],
    "ZCP723A1152" : [ {     <- 4 children
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "5P587UMFSD2",
      "QUANTITY REQUIRED" : 2.32222,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 3.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "05/15/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }, {
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "8P550ZPPOOLE",
      "QUANTITY REQUIRED" : 2.32222,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "05/15/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }, {
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "8P906WPPA3077",
      "QUANTITY REQUIRED" : 4.64444,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 2.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "05/15/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }, {
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "8U910LKSHBL",
      "QUANTITY REQUIRED" : 2.32222,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 4.01,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "12/13/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    } ],
    ... many more entries
}

1 Answer 1

1

You can read whole JSON payload as JsonNode and retrieve data property which is an array. If we can assume that always first element is array of names and all other elements are equal in size you can create Map manually and serialise it back to JSON. One tricky part is how to generate unique key value for the result JSON. Example code could look like below:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.databind.node.ArrayNode;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Excel2JonsApp {
    public static void main(String[] args) throws IOException {
        File jsonFile = new File("./resource/test.json").getAbsoluteFile();

        Excel2JsonMapper mapper = new Excel2JsonMapper();
        String json = mapper.convertToJson(jsonFile);
        System.out.println(json);
    }
}

class Excel2JsonMapper {
    private final ObjectMapper jsonMapper;

    Excel2JsonMapper() {
        this.jsonMapper = createJsonMapper();
    }

    private ObjectMapper createJsonMapper() {
        ObjectMapper mapper = new ObjectMapper();
        mapper.enable(SerializationFeature.INDENT_OUTPUT);

        return mapper;
    }

    String convertToJson(File jsonFile) throws IOException {
        ArrayNode arrayNode = readDataArray(jsonFile);
        Map<Object, Map<String, JsonNode>> dataMap = convertArrayToMaps(arrayNode)
                .stream()
                .collect(Collectors.toMap(
                        map -> map.get("PARENT ITEM NUMBER").textValue(), //key generator function
                        Function.identity()));

        return jsonMapper.writeValueAsString(Collections.singletonMap("data", dataMap));
    }

    private List<Map<String, JsonNode>> convertArrayToMaps(ArrayNode arrayNode) {
        ArrayNode names = (ArrayNode) arrayNode.get(0);

        Iterator<JsonNode> iterator = arrayNode.iterator();
        iterator.next();// skip names

        List<Map<String, JsonNode>> list = new ArrayList<>();
        while (iterator.hasNext()) {
            ArrayNode values = (ArrayNode) iterator.next();
            Map<String, JsonNode> map = new LinkedHashMap<>();
            for (int i = 0; i < names.size(); i++) {
                map.put(names.get(i).textValue(), values.get(i));
            }
            list.add(map);
        }
        return list;
    }

    private ArrayNode readDataArray(File jsonFile) throws IOException {
        JsonNode root = jsonMapper.readTree(jsonFile);
        return (ArrayNode) root.at("/sheets/0/data");
    }
}

Above code prints:

{
  "data" : {
    "208E8840168" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840168",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 1.36,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "02/26/08",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "21PPH150166" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "21PPH150166",
      "2ND ITEM NUMBER" : "8P315TPMRG",
      "QUANTITY REQUIRED" : 1.39629,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "03/05/18",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "208E8840180" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840180",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 1.4565,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "P",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "03/04/09",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "208E8840172" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840172",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 1.3924,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "02/26/08",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "208E8840040" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840040",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 0.32031,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "09/11/13",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }
  }
}
Sign up to request clarification or add additional context in comments.

4 Comments

This was helpful. You showed me some neat tricks I did not know about. There is a problem though. Because the PARENT ITEM NUMBER is not always unique (see my explanation in the posting), I'm running into an issue where it's trying to insert duplicate keys into dataMap. I think dataMap values need to be a list of "row" entries, representing the child items (idenitfied by 2ND ITEM NUMBER).
@JosephGagnon, I do not understand what do you mean by child and parent items. In case of duplicated keys you need to use a list and save all duplicated objects. After that you can iterate over list and add to a Map using some other method to generate keys. You can create new question, link to this one, and describe what to you want to achieve. Also provide an example how it should work for better understanding.
I was able to find a solution using your example. See my update in the original post.
@JosephGagnon, I'm glad it helped. Now I understand, you could say you want to have a list under a given key 😀

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.