I have data that was pulled from an Excel file into a JSON structure using Java, Apache POI and Jackson. The resulting JSON data structure looks like the following:
{
"fileName" : "C:\\Users\\jgagnon\\sample_data\\PDM_BOM.xlsx",
"sheets" : [ {
"name" : "PDM_BOM",
"data" : [ [ "BRANCH", "PARENT ITEM NUMBER", "2ND ITEM NUMBER", "QUANTITY REQUIRED", "UNIT OF MEASURE", "ISSUE TYPE CODE", "LINE TYPE", "STOCK TYPE", "TYPE BOM", "LINE NUMBER", "OPERATING SEQUENCE", "EFFECTIVE FROM DATE", "EFFECTIVE THRU DATE", "DRAWING NUMBER", "UNIT COST", "SCRAP PERCENT" ],
[ "B20", "208E8840040", "5P884LPFSR2", 0.32031, "LB", "I", "S", "M", "M", 1.0, 10.0, "09/11/13", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "208E8840168", "5P884LPFSR2", 1.36, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "208E8840172", "5P884LPFSR2", 1.3924, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "208E8840180", "5P884LPFSR2", 1.4565, "LB", "I", "S", "P", "M", 1.0, 10.0, "03/04/09", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "21PPH150166", "8P315TPMRG", 1.39629, "LB", "I", "S", "M", "M", 1.0, 10.0, "03/05/18", "12/31/40", null, 0.0, 0.0 ] ],
"maxCols" : 16,
"maxRows" : 14996
} ]
}
In the data element, there is essentially an array of arrays, representing all of the rows from a worksheet. The first row array is the column headers for the data rows that follow.
I'd like to be able to restructure the data so that it's in the form of a map of objects, where each key is (for this example) the value in the PARENT ITEM NUMBER column. The map value associated with the key would be a JSON object consisting of the key/value pairs of column header and column value for the current data row.
So, using the example above, I'd end up with something like this: (I may have the JSON syntax/structuring incorrect)
{
"data": {
"208E8840040": {
"BRANCH": "B20",
"PARENT ITEM NUMBER": "208E8840040",
"2ND ITEM NUMBER": "5P884LPFSR2",
"QUANTITY REQUIRED": 0.32031,
"UNIT OF MEASURE": "LB",
"ISSUE TYPE CODE": "I",
"LINE TYPE": "S",
"STOCK TYPE": "M",
"TYPE BOM": "M",
"LINE NUMBER": 1.0,
"OPERATING SEQUENCE": 10.0,
"EFFECTIVE FROM DATE": "09/11/13",
"EFFECTIVE THRU DATE": "12/31/40",
"DRAWING NUMBER": null,
"UNIT COST": 0.0,
"SCRAP PERCENT": 0.0
},
"208E8840168": {
"BRANCH": "B20",
"PARENT ITEM NUMBER": "208E8840168",
"2ND ITEM NUMBER": "5P884LPFSR2",
"QUANTITY REQUIRED": 1.36,
...
},
...
}
}
I'm looking for a way to convert the former into the latter.
UPDATE:
I just realized I left out an important detail.
The data in this sheet (table), is essentially keyed by the PARENT ITEM NUMBER column. However, although that column is the primary identifier, it is not always unique in the table.
There are often cases, where there are multiple rows with the same value for PARENT ITEM NUMBER. Each of these rows contains information about elements that make up the "parent" (think of them as child items). These child items are identified by the 2ND ITEM NUMBER column.
Moreover, many child items will have their own row in the table where the PARENT ITEM NUMBER is the child item number identified by 2ND ITEM NUMBER for the parent. These child items, as you may have guessed, can have their own child items, and so on.
Basically, this is a tabular representation of multiple hierarchies of related data. Some child items will appear (are re-used) for multiple parent items.
I don't know in what ways this complicates what I'm trying to do.
UPDATE:
Thanks to https://stackoverflow.com/users/51591/micha%c5%82-ziober for the initial idea. I adapted it slightly to yield a map of lists of child items. The modified code follows:
public String convertToJson(File jsonFile) throws IOException {
ArrayNode arrayNode = readDataArray(jsonFile);
List<Map<String, JsonNode>> rowMaps = convertArrayToMaps(arrayNode);
Map<Object, List<Map<String, JsonNode>>> dataMap = rowMaps.stream()
.collect(Collectors.groupingBy(map -> map.get("PARENT ITEM NUMBER").textValue()));
return jsonMapper.writeValueAsString(Collections.singletonMap("data", dataMap));
}
Here's an example of the output:
{
"data" : {
"MTDMN97PJ1A9" : [ { <- 1 child
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "MTDMN97PJ1A9",
"2ND ITEM NUMBER" : "MTDMN970144XO",
"QUANTITY REQUIRED" : 12.0,
"UNIT OF MEASURE" : "SY",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 1.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "01/18/19",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
} ],
"ZCP723A1152" : [ { <- 4 children
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "5P587UMFSD2",
"QUANTITY REQUIRED" : 2.32222,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 3.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "05/15/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
}, {
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "8P550ZPPOOLE",
"QUANTITY REQUIRED" : 2.32222,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 1.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "05/15/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
}, {
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "8P906WPPA3077",
"QUANTITY REQUIRED" : 4.64444,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 2.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "05/15/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
}, {
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "8U910LKSHBL",
"QUANTITY REQUIRED" : 2.32222,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 4.01,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "12/13/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
} ],
... many more entries
}