14

I'm trying to convert a string which represents a JSON object to a real JSON object using json.loads but it doesn't convert the integers:

(in the initial string, integers are always strings)

$> python
Python 2.7.9 (default, Aug 29 2016, 16:00:38)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> c = '{"value": "42"}'
>>> json_object = json.loads(c, parse_int=int)
>>> json_object
{u'value': u'42'}
>>> json_object['value']
u'42'
>>>

Instead of {u'value': u'42'}, I'd like it to become {u'value': 42}. I know I can walk the whole object myself, but I'd rather not: doing it manually seems inefficient when this parse_int argument exists (https://docs.python.org/2/library/json.html#json.loads).

Thanks to Pierce's suggestion, this works:

Python 2.7.9 (default, Aug 29 2016, 16:00:38)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>>
>>> class Decoder(json.JSONDecoder):
...     def decode(self, s):
...         result = super(Decoder, self).decode(s)
...         return self._decode(result)
...     def _decode(self, o):
...         if isinstance(o, str) or isinstance(o, unicode):
...             try:
...                 return int(o)
...             except ValueError:
...                 try:
...                     return float(o)
...                 except ValueError:
...                     return o
...         elif isinstance(o, dict):
...             return {k: self._decode(v) for k, v in o.items()}
...         elif isinstance(o, list):
...             return [self._decode(v) for v in o]
...         else:
...             return o
...
>>>
>>> c = '{"value": "42", "test": "lolol", "abc": "43.4",  "dcf": 12, "xdf": 12.4}'
>>> json.loads(c, cls=Decoder)
{u'test': u'lolol', u'dcf': 12, u'abc': 43.4, u'value': 42, u'xdf': 12.4}
  • Why is it "42" instead of 42 in the first place? Commented Jul 12, 2017 at 23:01
  • Well, your JSON example '{"value": "42"}' has 42 as a string, not an int. Your best bet is either to fix the data coming in or (if that's not feasible) write a custom JSON decoder. Commented Jul 12, 2017 at 23:01
  • The parse_int option is only used for parts of the JSON that have the syntax of an integer. The double quotes make it a string, not an integer, so it doesn't use the parse_int option. Commented Jul 12, 2017 at 23:07
  • @Barmar I'm a bit lost on that functionality. From all JSON I've worked with, 42 would be an int without parse_int and "42" would be a string. Do you have a link for a use-case on parse_int? Commented Jul 12, 2017 at 23:11
  • @roganjosh The documentation suggests this use case: "This can be used to use another datatype or parser for JSON integers (e.g. float)." Commented Jul 12, 2017 at 23:14
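To make the point in these comments concrete, here is a minimal sketch (Python 3 syntax; the sample document and variable names are made up for illustration). It suggests that parse_int is only consulted for bare integer literals, so a quoted "42" stays a string whatever you pass:

```python
import json

# parse_int is applied only to tokens that are written as JSON integers.
doc = '{"bare": 42, "quoted": "42", "real": 4.2}'

default = json.loads(doc)
as_float = json.loads(doc, parse_int=float)

print(default)   # {'bare': 42, 'quoted': '42', 'real': 4.2}
print(as_float)  # {'bare': 42.0, 'quoted': '42', 'real': 4.2}
```

Only the unquoted 42 changes type; the quoted value never reaches parse_int.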

5 Answers

11

In addition to Pierce's answer: I think you can use the object_hook parameter of json.loads instead of the cls one, so you don't need to walk the JSON object twice.

For example:

def _decode(o):
    # Note the "unicode" part is only for python2
    if isinstance(o, str) or isinstance(o, unicode):
        try:
            return int(o)
        except ValueError:
            return o
    elif isinstance(o, dict):
        return {k: _decode(v) for k, v in o.items()}
    elif isinstance(o, list):
        return [_decode(v) for v in o]
    else:
        return o

# Then you can do:
json.loads(c, object_hook=_decode)

As @ZhanwenChen pointed out in a comment, the code above is for Python 2. For Python 3, remove the or isinstance(o, unicode) part of the first if condition: unicode is not defined there, so evaluating it raises a NameError.


3 Comments

In Python 3, the str class subsumed the unicode class, so your code would raise a NameError because unicode is undefined. Please edit your answer.
In the dict comprehension, if you also want to decode the keys, you can use {_decode(k): ...}
object_hook will never be called if the string to be decoded contains no dict, for example '["3"]'. A hack would be to pass f'{{"dummykey":{stringtodecode}}}' and return the first value of the decoded dict. Also, I think returning json.loads(o, object_hook=_decode) and catching json.JSONDecodeError in the first if clause would cover the general type casting case.
9

As we established in the comments, there is no existing functionality to do this for you. I also read through the documentation and some examples for JSONDecoder, and it does not appear to do what you want without processing the data twice.

The best option, then, is something like this:

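import json

try:
    string_types = (str, unicode)  # Python 2: str and unicode are separate types
except NameError:
    string_types = (str,)  # Python 3: unicode no longer exists; str covers it

class Decoder(json.JSONDecoder):
    def decode(self, s):
        result = super().decode(s)  # result = super(Decoder, self).decode(s) for Python 2.x
        return self._decode(result)

    def _decode(self, o):
        # Convert integer-looking strings; recurse into dicts and lists.
        if isinstance(o, string_types):
            try:
                return int(o)
            except ValueError:
                return o
        elif isinstance(o, dict):
            return {k: self._decode(v) for k, v in o.items()}
        elif isinstance(o, list):
            return [self._decode(v) for v in o]
        else:
            return o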

This has the downside of processing the JSON object twice — once in the super().decode(s) call, and again to recurse through the entire structure to fix things. Also note that this will convert anything which looks like an integer into an int. Be sure to account for this appropriately.

To use it, you do e.g.:

>>> c = '{"value": "42"}'
>>> json.loads(c, cls=Decoder)
{'value': 42}
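On the warning that anything which looks like an integer gets converted: int() is fairly permissive about what it accepts, as the sketch below suggests (Python 3.6+ for the underscore case). The helper names loose and strict and the regular expression are illustrative assumptions, showing one way to restrict conversion to strings that follow JSON's own integer syntax:

```python
import re

def loose(s):
    """Convert with plain int(), as in the decoder above."""
    try:
        return int(s)
    except ValueError:
        return s

# Assumed stricter rule: optional minus sign, no leading zeros, ASCII digits only.
_STRICT_INT = re.compile(r"-?(0|[1-9][0-9]*)")

def strict(s):
    """Convert only strings that match the JSON integer grammar."""
    return int(s) if _STRICT_INT.fullmatch(s) else s

for s in ["42", " 42 ", "+7", "007", "1_000", "4.2", "abc"]:
    print(repr(s), "->", repr(loose(s)), "|", repr(strict(s)))
```

With loose, values such as " 42 ", "+7", "007" and "1_000" all become integers, which can silently mangle zip codes or zero-padded IDs; strict leaves them as strings.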

4 Comments

Thank you Pierce, your code seems right, but it raises an error on result = super().decode(s)
@Léo I wrote this in Python 3; if you're using Python 2 you'd need result = super(Decoder, self).decode(s). If that's not the issue, can you tell me what error you're seeing and I can try to help you?
@Léo ah! I didn't realize it was doing unicode. I've updated my answer to accommodate the unicode handling, and it appears to work fine for me now!
Well. You saved my Day! Thx a lot. :)
5

For my solution I used object_hook, which is useful when you have nested json

>>> import json
>>> json_data = '{"1": "one", "2": {"-3": "minus three", "4": "four"}}'
>>> py_dict = json.loads(json_data, object_hook=lambda d: {int(k) if k.lstrip('-').isdigit() else k: v for k, v in d.items()})

>>> py_dict
{1: 'one', 2: {-3: 'minus three', 4: 'four'}}

This filter only converts JSON keys to int. To convert values as well, you can apply int(v) if v.lstrip('-').isdigit() else v to them, provided v is a string.
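A possible way to apply the same filter to both keys and values is sketched below (Python 3). The helper names to_int and keys_and_values are hypothetical. The isinstance guard is there because values can be nested dicts, lists or numbers, which have no lstrip method:

```python
import json

def to_int(x):
    """Return int(x) for strings such as '12' or '-3'; return anything else unchanged."""
    if isinstance(x, str) and x.lstrip('-').isdigit():
        try:
            return int(x)
        except ValueError:
            # e.g. '--3' passes the lstrip/isdigit check but is not a valid integer
            pass
    return x

def keys_and_values(d):
    # Called by json.loads for every decoded JSON object, innermost first.
    return {to_int(k): to_int(v) for k, v in d.items()}

json_data = '{"1": "10", "2": {"-3": "minus three", "4": "-40"}}'
py_dict = json.loads(json_data, object_hook=keys_and_values)
print(py_dict)
# {1: 10, 2: {-3: 'minus three', 4: -40}}
```

Note that strings inside JSON arrays are not touched by this hook, since it only ever receives dicts.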

1 Comment

Excellent! It works like a charm. I wanted to transform str keys into int keys when possible. Thank you!
0

Building on @juanra's answer (and therefore @Pierce Darragh's), I added a conversion for boolean values stored as strings. My example is a dict converted from XML that contains 'true' and 'false', which the suggested answers won't load as the booleans True and False.

def _decode(o):
    if isinstance(o, str):
        if o.lower() == 'true':
            return True
        elif o.lower() == 'false':
            return False
        else:
            try:
                return int(o)
            except ValueError:
                return o
    elif isinstance(o, dict):
        return {k: _decode(v) for k, v in o.items()}
    elif isinstance(o, list):
        return [_decode(v) for v in o]
    else:
        return o

Depending on what you need, you can also include other strings in the boolean conversion; see "Converting from a string to boolean in Python?"

-3
def convert_to_int(params):
    for key in params.keys():
        if isinstance(params[key], dict):
            convert_to_int(params[key])
        elif isinstance(params[key], list):
            for item in params[key]:
                if not isinstance(item, (dict, list)):
                    item = int(item)
                else:
                    convert_to_int(item)
        else:
            params[key] = int(params[key])
    return params


print convert_to_int({'a': '3', 'b': {'c': '4', 'd': {'e': 5}, 'f': [{'g': '6'}]}})

5 Comments

The issue with this is that the OP wanted to parse the value "42" into an int in Python, which your code does not account for.
convert = lambda x: {x.keys()[0]: int(x.values()[0])} convert(json.loads('{"value": "42"}'))
Your lambda suggestion only works for dictionaries with exactly one value. It does not solve for multi-value dictionaries or arrays, and it also does not address the Unicode problem present in Python 2. Further, it isn't advisable to set the result of a lambda expression to a variable; just define a function and it becomes significantly easier to maintain. See my (accepted) answer for a more robust solution.
how about this one?
Better, but for key in params.keys() assumes that params is a dict, which it may not be since you call the function recursively. Additionally, you iterate through the keys of the dictionary and then continuously do params[key]. Why not for key, value in params.items() (or params.iteritems() in Python 2.x)? Additionally, much of this can be done with comprehensions — like what I did in my solution. I think list/dict comprehensions lead to easier-to-read code in cases like this.
