I have a file consisting of JSON objects, one per line, and I want to sort the file by update_time in descending order.

Sample JSON file:

{ "page": { "url": "url1", "update_time": "1415387875"}, "other_key": {} }
{ "page": { "url": "url2", "update_time": "1415381963"}, "other_key": {} }
{ "page": { "url": "url3", "update_time": "1415384938"}, "other_key": {} }

Desired output:

{ "page": { "url": "url1", "update_time": "1415387875"}, "other_key": {} }
{ "page": { "url": "url3", "update_time": "1415384938"}, "other_key": {} }
{ "page": { "url": "url2", "update_time": "1415381963"}, "other_key": {} }

My code:

#!/usr/bin/env python
# coding: utf8

import sys
import json

# Load one JSON object per line from stdin
lines = []
for line in sys.stdin:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    lines.append(json.loads(line))

# Sort by update_time, newest first (raises KeyError if the key is missing)
lines = sorted(lines, key=lambda k: k['page']['update_time'], reverse=True)

# Output the result as JSON, one object per line
for line in lines:
    print(json.dumps(line))

The code works fine with the sample JSON file, but if a JSON object has no 'update_time' key, it raises a KeyError. Is there a way to do this without triggering an exception?

4 Answers

51

Write a function that uses try...except to handle the KeyError, then use this as the key argument instead of your lambda.

def extract_time(json):
    try:
        # Also convert to int since update_time will be string.  When comparing
        # strings, "10" is smaller than "2".
        return int(json['page']['update_time'])
    except KeyError:
        return 0

# lines.sort() sorts in place, so it avoids the copy made by lines = sorted(lines)
lines.sort(key=extract_time, reverse=True)
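
As a quick check of the approach above, here is a minimal self-contained sketch, assuming Python 3. It uses the three sample records from the question plus two made-up records, one without update_time and one without page. The parameter name record and the variable sorted_urls are my own choices, not from the original answer.

```python
import json

def extract_time(record):
    """Return update_time as an int, or 0 when 'page' or 'update_time' is missing."""
    try:
        return int(record['page']['update_time'])
    except KeyError:
        # Raised for a missing 'page' key as well as a missing 'update_time'
        return 0

raw = '''\
{ "page": { "url": "url1", "update_time": "1415387875"}, "other_key": {} }
{ "page": { "url": "url2", "update_time": "1415381963"}, "other_key": {} }
{ "page": { "url": "url3", "update_time": "1415384938"}, "other_key": {} }
{ "page": { "url": "url4"}, "other_key": {} }
{ "other_key": {} }
'''

lines = [json.loads(line) for line in raw.splitlines() if line.strip()]
lines.sort(key=extract_time, reverse=True)

# URLs in sorted order; None marks the record that has no 'page' key
sorted_urls = [rec.get('page', {}).get('url') for rec in lines]
print(sorted_urls)
```

Records without a usable timestamp get the key 0, so with reverse=True they end up after every record that has one.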

1 Comment

This works, but you should avoid names that shadow standard library modules (here, the parameter name json).
35

You can use dict.get() with a default value:

lines = sorted(lines, key=lambda k: k['page'].get('update_time', 0), reverse=True)

Example:

>>> lines = [
...     {"page": {"url": "url1", "update_time": "1415387875"}, "other_key": {}},
...     {"page": {"url": "url2", "update_time": "1415381963"}, "other_key": {}},
...     {"page": {"url": "url3", "update_time": "1415384938"}, "other_key": {}},
...     {"page": {"url": "url4"}, "other_key": {}},
...     {"page": {"url": "url5"}, "other_key": {}}
... ]
>>> lines = sorted(lines, key=lambda k: k['page'].get('update_time', 0), reverse=True)
>>> for line in lines:
...     print line
... 
{'other_key': {}, 'page': {'url': 'url1', 'update_time': '1415387875'}}
{'other_key': {}, 'page': {'url': 'url3', 'update_time': '1415384938'}}
{'other_key': {}, 'page': {'url': 'url2', 'update_time': '1415381963'}}
{'other_key': {}, 'page': {'url': 'url4'}}
{'other_key': {}, 'page': {'url': 'url5'}}

Though, I would still follow the EAFP principle that Ferdinand suggested. That way you also handle the case where the page key is missing. It is much easier to let it fail and handle the exception than to check all sorts of corner cases.
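
One caveat, assuming Python 3: there the default 0 is an int while the stored update_time values are strings, and comparing an int with a str raises TypeError during the sort. Below is a minimal sketch of a variant that converts everything to int and also tolerates a missing page key. The helper name update_time_or_zero and the sample records without timestamps are hypothetical.

```python
def update_time_or_zero(record):
    # .get() with a default at both levels avoids KeyError for a missing
    # 'page' or 'update_time'; int() makes the default and the stored
    # string values comparable with each other.
    return int(record.get('page', {}).get('update_time', 0))

lines = [
    {"page": {"url": "url1", "update_time": "1415387875"}, "other_key": {}},
    {"page": {"url": "url4"}, "other_key": {}},
    {"page": {"url": "url2", "update_time": "1415381963"}, "other_key": {}},
    {"other_key": {}},
]
lines = sorted(lines, key=update_time_or_zero, reverse=True)

for line in lines:
    print(line)
```

This still raises ValueError if update_time holds something int() cannot parse, which is one more reason the try...except version may be the more robust choice.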

2 Comments

How do I load the JSON file into lines dynamically? If the file has 1 million lines, it won't all load at once, right?
I am getting an error: lines = sorted(json_string, key=lambda k: k['modified'], reverse=True) raises TypeError: string indices must be integers.
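
The TypeError in the comment above is what Python raises when sorted() is handed a string: it iterates over single characters, and k['modified'] then tries to index a str. A minimal sketch of the likely fix, assuming the input holds one JSON object per line and each object has a top-level modified key as in the comment (the sample values and the name field are made up):

```python
import json

json_string = '''\
{"name": "a", "modified": 3}
{"name": "b", "modified": 7}
{"name": "c", "modified": 5}
'''

# Parse each line into a dict first; sorting the raw string would iterate
# over its characters instead of over records.
records = [json.loads(line) for line in json_string.splitlines() if line.strip()]
records = sorted(records, key=lambda k: k['modified'], reverse=True)

names = [r['name'] for r in records]
print(names)
```

For a large file, iterating over the open file object reads one line at a time, but sorted() still needs every record in memory before it can produce output.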
18
# sort json
lines = sorted(lines, key=lambda k: k['page'].get('update_time', 0), reverse=True)

3
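def get_sortest_key(a: dict, o: dict):
    # Recursive selection sort by value: move the entry with the smallest
    # value from a into o, then repeat until a is empty.
    # Note: a is emptied in the process, and the result relies on dicts
    # preserving insertion order (Python 3.7+).
    v = None
    k = None
    for key, value in a.items():
        if v is None:
            # First entry seen becomes the current minimum
            v = value
            k = key
            continue
        if v > value:
            v = value
            k = key
    o.update({k: v})
    a.pop(k)
    if a:
        get_sortest_key(a, o)


def call(o):
    a = {'a': 9, 'b': 1, 'c': 3, 'k': 3, 'l': -1, 's': 100}
    get_sortest_key(a, o)  # fills o in ascending order of value
    print(o)


o = {}
call(o)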

1 Comment

Adding some explanation of how your code addresses the original problem would be useful.
