I have a JSON string like the one below, and I want to split (explode) it into multiple JSON strings.

input:

{
    "name": "paddy",
    "age": 29,
    "cities": [
        {
            "cityName": "Chennai",
            "year": "2013-2015"
        },
        {
            "cityName": "Bangalore",
            "year": "2015-2019"
        }
    ]
}

And I want to convert it into two JSON strings:

json 1

{
    "name": "paddy",
    "age": 29,
    "cities": [
        {
            "cityName": "Chennai",
            "year": "2013-2015"
        }
    ]
}

json 2

{
    "name": "paddy",
    "age": 29,
    "cities": [
        {
            "cityName": "Bangalore",
            "year": "2015-2019"
        }
    ]
}

So far, my approach (below) uses the Jackson library.

package com.test;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.databind.node.ArrayNode;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class JsonParser {

  public static void main(String[] args) throws IOException {

    String json =
        "{\n"
            + "\t\"name\": \"paddy\",\n"
            + "\t\"age\": 29,\n"
            + "\t\"cities\": [\n"
            + "\t\t{\n"
            + "\t\t\t\"cityName\": \"Chennai\",\n"
            + "\t\t\t\"year\": \"2013-2015\"\n"
            + "\t\t},\n"
            + "\t\t{\n"
            + "\t\t\t\"cityName\": \"Bangalore\",\n"
            + "\t\t\t\"year\": \"2015-2019\"\n"
            + "\t\t}\n"
            + "\t]\n"
            + "}";

    ObjectMapper mapper = new ObjectMapper();
    mapper.enable(SerializationFeature.INDENT_OUTPUT);

    // Create a list to store the result (the list will store Jackson tree model objects)
    List<JsonNode> result = new ArrayList<>();


    JsonNode tree = mapper.readTree(json);
    JsonNode paths = tree.get("cities");
    Iterator<JsonNode> elements = paths.elements();

    while (elements.hasNext()) {


      JsonNode path = elements.next();


      // Create a copy of the tree
      JsonNode copyOfTree = mapper.valueToTree(tree);
      ((ArrayNode)copyOfTree.get("cities")).removeAll().add(path);


      // Add the modified tree to the result list
      result.add(copyOfTree);
    }

    // Print the result
    for (JsonNode node : result) {
      System.out.println(mapper.writeValueAsString(node));
      System.out.println();
    }
  }
}

The above approach works fine if the JSON is small. Is there a better approach for handling large JSON files? For example, assume "cities" has a million objects.

Thanks.

  • Please clarify the requirements. Your inner list has N entries ... and you want to duplicate that into N elements that all have exactly one of the list entries inside? (beyond the pure technical side: that approach sounds fishy ... as said: you are duplicating a ton of data. what is the purpose of doing so?) Commented Jul 29, 2019 at 11:02
  • @GhostCat I know I am doing this wrong. That's why I need help splitting it in an efficient way. Commented Jul 29, 2019 at 11:04
  • What I mean is: the whole approach sounds doubtful. Not just the implementation. I suggest that you rethink why you think you need to turn one JSON string into "potentially" millions, that are all the same, besides one inner entry. As in: if you do "the wrong thing", you shouldn't spend your time thinking "how to do the wrong thing more efficiently". Commented Jul 29, 2019 at 11:06
  • @GhostCat I am using Apache Spark to process the data. If I get millions of strings, I can process them in parallel. I don't want to use Spark's built-in JSON processor. Commented Jul 29, 2019 at 11:14

1 Answer

There are several factors to consider. First, do not copy the whole root object. If the cities array is large, you waste memory by copying it in full and then removing all of its elements from the copy. See the example below:

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.io.File;
import java.io.IOException;

public class JsonApp {

  public static void main(String[] args) throws IOException {
    File jsonFile = new File("./spring-basics/src/main/resources/test.json");

    ObjectMapper mapper = new ObjectMapper();

    // read whole JSON
    ObjectNode root = (ObjectNode) mapper.readTree(jsonFile);
    String citiesFieldName = "cities";

    // remove cities from root, now it contains only common properties
    ArrayNode cities = (ArrayNode) root.remove(citiesFieldName);
    cities.elements().forEachRemaining(item -> {
      // copy root
      ObjectNode copyOfRoot = root.deepCopy();
      // add one city to copy
      copyOfRoot.set(citiesFieldName, copyOfRoot.arrayNode().add(item));

      // add to result or send further
      System.out.println(copyOfRoot);
    });
  }
}

The code above copies the root (which no longer contains the cities array) and adds one element to a new cities array. Next, decide what to do with the result: you can send each document on for further processing immediately, or collect the results in a list and send them in a bulk operation. Another improvement is to split the cities array into bigger chunks of more than one element. See this article on how to split a list. For example, if you have 1_000_000 elements, split them into chunks of 1_000 elements.
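The chunking idea can be sketched with plain Java collections. This is only an illustration, not code from the linked article: the class name ChunkSplitter, the partition helper, and the extra sample city names are hypothetical. Each resulting chunk could then be added to a fresh ArrayNode on a copy of the root, in the same way the single item is added above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkSplitter {

  // Splits items into consecutive chunks of at most chunkSize elements.
  // Hypothetical helper, written for illustration only.
  public static <T> List<List<T>> partition(List<T> items, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    List<List<T>> chunks = new ArrayList<>();
    for (int start = 0; start < items.size(); start += chunkSize) {
      int end = Math.min(start + chunkSize, items.size());
      // copy the sub-list view so each chunk is independent of the source list
      chunks.add(new ArrayList<>(items.subList(start, end)));
    }
    return chunks;
  }

  public static void main(String[] args) {
    // sample data; only the first two names come from the question
    List<String> cities = Arrays.asList("Chennai", "Bangalore", "Mumbai", "Delhi", "Pune");

    // prints three chunks: two of size 2 and a final one of size 1
    for (List<String> chunk : partition(cities, 2)) {
      System.out.println(chunk);
    }
  }
}
```

With 1_000_000 elements and a chunk size of 1_000, this would produce 1_000 documents instead of 1_000_000, so the common properties are copied far fewer times.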
