3

I am importing and manipulating some deeply nested JSON (imported as a dictionary). I can assign the values just fine using code like:

query['query']['function_score']['query']['multi_match']['operator'] = 'or'
query['query']['function_score']['query']['multi_match'].update({
        'minimum_should_match' : '80%' })  

But it's ugly and cumbersome as nuts. I'm wondering if there's a cleaner way to assign values to deep-nested keys that's reasonably efficient?

I've read about possibly using an in-memory SQLite db, but the data is going back into JSON after a bit of manipulation.

4
  • You can assign one of the inner dictionaries to a variable and manipulate it: d = query[1][2][3][4]; d[5].update(..) Commented Dec 15, 2016 at 6:20
  • 2
    Don't use such deeply nested dictionaries. Commented Dec 15, 2016 at 6:20
  • @Natecat: why not? what is the alternative? If you do have a lengthy configuration to do, why not? Commented Aug 3, 2017 at 15:30
  • See also: stackoverflow.com/questions/7681301/… stackoverflow.com/a/16508328/42223 Commented Oct 30, 2017 at 19:56

4 Answers

4
multi_match = query['query']['function_score']['query']['multi_match']
multi_match['operator'] = 'or'
multi_match.update({'minimum_should_match' : '80%' })
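This works because `multi_match` is bound to the same inner dict object, not a copy, so changes made through the short name show up in `query`. A minimal sketch, assuming a small sample `query` dict made up here for illustration:

```python
# Sample data invented for illustration; the real query would come from JSON.
query = {'query': {'function_score': {'query': {'multi_match': {'operator': 'and'}}}}}

# Bind a short name to the inner dict (this is a reference, not a copy).
multi_match = query['query']['function_score']['query']['multi_match']
multi_match['operator'] = 'or'
multi_match.update({'minimum_should_match': '80%'})

# The original structure reflects both changes.
print(query['query']['function_score']['query']['multi_match'])
# {'operator': 'or', 'minimum_should_match': '80%'}
```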

3

JSONPath (via 'jsonpath_rw') makes it less cumbersome:

Previous:

>>> query
{u'query': {u'function_score': {u'query': {u'multi_match': {u'min_should_match': u'20%'}}}}}

Update:

>>> import jsonpath_rw
>>> found = jsonpath_rw.parse("$..multi_match").find(query)[0]
>>> found.value["operator"] = "or"
>>> found.value["min_should_match"] = "80%"

Afterwards:

>>> query
{u'query': {u'function_score': {u'query': {u'multi_match': {'min_should_match': '80%', u'operator': u'or'}}}}}


2

The chosen answer is definitely the way to go. The problem I (later) found is that my nested key can appear at varying levels. So I needed to be able to traverse the dict and find the path to the node first, and THEN do the update or addition.

jsonpath_rw was the immediate solution, but I got some strange results trying to use it. I gave up after a couple hours of wrestling with it.

At the risk of getting shot down for being a clunky newb, I did end up fleshing out a few functions (based on other code I found on SO) that natively do some nice things to address my needs:

def find_in_obj(obj, condition, path=None):
    ''' generator finds full path to nested dict key when key is at an unknown level 
        borrowed from http://stackoverflow.com/a/31625583/5456148'''
    if path is None:
        path = []

    # In case this is a list
    if isinstance(obj, list):
        for index, value in enumerate(obj):
            new_path = list(path)
            new_path.append(index)
            for result in find_in_obj(value, condition, path=new_path):
                yield result

    # In case this is a dictionary
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_path = list(path)
            new_path.append(key)
            for result in find_in_obj(value, condition, path=new_path):
                yield result

            if condition == key:
                new_path = list(path)
                new_path.append(key)
                yield new_path


def set_nested_value(nested_dict, path_list, key, value):
    ''' add or update a value in a nested dict using passed list as path
        borrowed from http://stackoverflow.com/a/11918901/5456148'''
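    cur = nested_dict
    # Copy the path so the caller's list is not mutated.
    full_path = list(path_list) + [key]
    for path_item in full_path[:-1]:
        try:
            cur = cur[path_item]
        except KeyError:
            # Create the missing level in the parent, then descend into it.
            cur[path_item] = {}
            cur = cur[path_item]

    cur[full_path[-1]] = value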
    return nested_dict


def update_nested_dict(nested_dict, findkey, updatekey, updateval):
    ''' finds and updates values in nested dicts with find_in_obj(), set_nested_value()'''
    return set_nested_value(
        nested_dict,
        list(find_in_obj(nested_dict, findkey))[0],
        updatekey,
        updateval
    )

find_in_obj() is a generator that finds a path to a given nested key.

set_nested_value() will either update the key/value in the dict at the given path list or add it if it doesn't exist.

update_nested_dict() is a "wrapper" for the two functions that takes the nested dict to search, the key you're looking for, and the key and value to update (or add if it doesn't exist).

So I can pass in:

q = update_nested_dict(q, 'multi_match', 'operator', 'or')
q = update_nested_dict(q, 'multi_match', 'minimum_should_match', '80%')

And the "operator" value is updated, and the 'minimum_should_match' key/value is added under the 'multi_match' node, no matter what level it appears in the dictionary.

This might run into problems if the searched key exists in more than one place in the dictionary, though, since update_nested_dict() only acts on the first path that find_in_obj() yields.
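If the key can appear in several places, one option is to walk the structure and update every match rather than only the first. A minimal sketch, assuming a hypothetical helper named `update_all_nested` (not part of the functions above) and assuming every match should receive the same value:

```python
def update_all_nested(obj, findkey, updatekey, updateval):
    """Set updatekey=updateval inside every dict stored under findkey.

    Hypothetical helper for illustration; returns the number of updates made.
    """
    count = 0
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == findkey and isinstance(value, dict):
                value[updatekey] = updateval
                count += 1
            # Keep descending: the key may also occur deeper down.
            count += update_all_nested(value, findkey, updatekey, updateval)
    elif isinstance(obj, list):
        for item in obj:
            count += update_all_nested(item, findkey, updatekey, updateval)
    return count


# Sample data invented for illustration.
q = {
    'query': {'multi_match': {'operator': 'and'}},
    'filters': [{'multi_match': {}}],
}
n = update_all_nested(q, 'multi_match', 'operator', 'or')
```

Returning the count makes it easy to notice when nothing matched, or when more nodes matched than expected.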


1

You can write your own setnesteditem function once and for all:

def setnesteditem(xs, path, newvalue):
    for i in path[:-1]:
        xs = xs[i]
    xs[path[-1]] = newvalue

And then call it on any nested structure, including lists and dicts:

query = {'query': {'function_score': {'query': {'multi_match': {}}}}}

prefix = ('query', 'function_score', 'query', 'multi_match')
setnesteditem(query, (*prefix, 'operator'), 'or')
setnesteditem(query, (*prefix, 'minimum_should_match'), '80%')

print(query)
# {'query': {'function_score': {'query': {'multi_match': {'operator': 'or', 'minimum_should_match': '80%'}}}}}
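Note that `setnesteditem` raises an error if an intermediate level is missing. If the intermediate levels should be created on demand, a variant along these lines may help. This is only a sketch: `setnesteditem_create` is a hypothetical name, and it assumes every level along the path is a dict (no lists).

```python
def setnesteditem_create(xs, path, newvalue):
    """Like setnesteditem, but create missing intermediate dicts.

    Hypothetical variant for illustration; only handles dict levels.
    """
    for key in path[:-1]:
        # setdefault returns the existing child, or inserts and returns {}.
        xs = xs.setdefault(key, {})
    xs[path[-1]] = newvalue


query = {}
prefix = ('query', 'function_score', 'query', 'multi_match')
setnesteditem_create(query, prefix + ('operator',), 'or')
setnesteditem_create(query, prefix + ('minimum_should_match',), '80%')

print(query)
# {'query': {'function_score': {'query': {'multi_match': {'operator': 'or', 'minimum_should_match': '80%'}}}}}
```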
