Remove json objects based on key value using Python

Question

EDIT: Forgot to mention I am using Python 2.7

I have a large JSON file structured like this:

[{
"headline": "Algérie Télécom prolonge son dispositif spécial Covid-19",
"url_src": "https://www.algerie360.com/algerie-telecom-prolonge-son-dispositif-special-covid-19/",
"img_src": "https://www.algerie360.com/wp-content/uploads/2020/04/DIA-Iddom-Algérie-télécom-320x200.jpg",
"news_src": "Algérie 360",
"catPT": "Ciência e Tecnologia",
"catFR": "Science et Technologie",
"catEN": "Science and Technology",
"lang": "French",
"epoch": 1591293345.817
},
{
"headline": "Internet haut débit à Alger : Lancement de la généralisation du  » fibre to home »",
"url_src": "https://www.algerie360.com/20200510-internet-haut-debit-a-alger-lancement-de-la-generalisation-du-fibre-to-home/",
"img_src": "https://www.algerie360.com/wp-content/uploads/2020/05/unnamed-320x200.jpg",
"news_src": "Algérie 360",
"catPT": "Ciência e Tecnologia",
"catFR": "Science et Technologie",
"catEN": "Science and Technology",
"lang": "French",
"epoch": 1591283345.817
},
...

I've been trying to write a .py script that opens my json file, removes all objects where the "epoch" key is less than 1591293345.817, and overwrites the current file.

Is this possible at all?

I've tried the following, but my Python knowledge is sketchy at best:

import time
import os
import json
import jsonlines

json_lines = []
with open('./json/news_done.json', 'r') as open_file:
    for line in open_file.readlines():
        j = json.loads(line)
        now = time.time()
        print(j['epoch'])
        lastWeek = now - 3600
        if not j['{epoch}'] > lastWeek:
            json_lines.append(line)

with open('./json/news_done.json', 'w') as open_file:
    open_file.writelines('\n'.join(json_lines))

Is the file in "json-lines" format (i.e. each line is a separate JSON object) or is it just one big structure like you show in the question?
– tzaman, Commented Jun 12, 2020 at 11:43
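
To make the distinction in this comment concrete, here is a small sketch. It assumes a made-up two-object sample in the same shape as the file in the question; `parses_line_by_line` is a hypothetical helper, not part of any library. It shows that a pretty-printed JSON array cannot be parsed one line at a time, while the same records written as JSON Lines can.

```python
import json

# Hypothetical two-object sample laid out like the file in the question.
array_text = (
    '[{\n"headline": "a",\n"epoch": 1591293345.817\n},\n'
    '{\n"headline": "b",\n"epoch": 1591283345.817\n}]'
)


def parses_line_by_line(text):
    """Return True only if every non-empty line is a complete JSON value."""
    try:
        for line in text.splitlines():
            if line.strip():
                json.loads(line)
        return True
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return False


# The array is not JSON Lines: a line such as '[{' is not valid JSON by itself.
print(parses_line_by_line(array_text))
# Parsed as one document, it yields a list with two dicts.
print(len(json.loads(array_text)))

# The same records written as JSON Lines, one object per line.
jsonl_text = "\n".join(json.dumps(obj) for obj in json.loads(array_text))
print(parses_line_by_line(jsonl_text))
```

If the file really is one big array, as the excerpt suggests, it probably needs to be loaded as a whole before filtering.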

Castlstream · Accepted Answer · 2020-06-12 12:38:06Z

2

Have you tried the pandas library? It makes it easy to filter rows by the value of a column.

I got this code snippet to work with your example data:

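import pandas as pd

# Load the JSON array into a DataFrame (one row per object)
dataset = pd.read_json('example.json')
# Keep only the rows whose epoch is at or above the cutoff
new_dataset = dataset[dataset['epoch'] >= 1591293345.817]
# to_json already returns a JSON string, so write it out directly;
# passing it through json.dump would encode it a second time
final_data = new_dataset.to_json(orient='records')

with open('example.json', 'w') as f:
    f.write(final_data)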
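
A note on the write step, sketched with the standard library only: `to_json` returns text that is already JSON, and handing already-encoded text to `json.dump` encodes it again, so the file ends up holding one quoted string rather than a list. The `records` list below is a made-up stand-in for the filtered rows.

```python
import json

# Hypothetical stand-in for the filtered rows.
records = [{"headline": "a", "epoch": 1591293345.817}]

encoded_once = json.dumps(records)        # comparable to what to_json returns
encoded_twice = json.dumps(encoded_once)  # what json.dump writes for that string

# Decoding the double-encoded text gives back a string, not a list.
print(isinstance(json.loads(encoded_twice), list))
# Decoding the single-encoded text gives the list of dicts again.
print(json.loads(encoded_once) == records)
```

A file written the double-encoded way still parses, but any later code that loops over it expecting dicts will see characters instead.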

edited Jun 12, 2020 at 12:38

answered Jun 12, 2020 at 12:32


2 Comments

maxnormal Over a year ago

I am getting the following error: ValueError: DataFrame constructor not properly called!

Castlstream Over a year ago

Could you give the whole error message? I wasn't able to trace back to that error

Ramtin Nouri · Accepted Answer · 2020-06-12 12:30:38Z

1

It looks like you're only removing the "epoch" tag, but if I've understood correctly, you want to drop the whole element.

You can load the entire file as JSON instead of parsing it line by line:

import json, time
with open('./json/news_done.json', 'r') as open_file:
    yourFileRead = open_file.read()
    yourJson = json.loads(yourFileRead)

filteredList = []
for j in yourJson:  # j is one element of the list, not just one line
    if time.time()-3600 > j['epoch']:
        filteredList.append(j)

with open('./json/news_done.json', 'w') as open_file:
    open_file.write(json.dumps(filteredList))
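
For the fixed cutoff mentioned in the question, the same approach could look like the sketch below. Note that it keeps objects whose "epoch" is at or above the cutoff, which is the reverse of the comparison above. `filter_by_epoch` and `CUTOFF` are names made up for this example, and the demo runs against a temporary file rather than './json/news_done.json'.

```python
import io
import json
import os
import tempfile

CUTOFF = 1591293345.817  # threshold taken from the question


def filter_by_epoch(path, cutoff):
    """Rewrite the JSON array at `path`, keeping objects with epoch >= cutoff.

    Returns the number of objects removed.
    """
    with io.open(path, 'r', encoding='utf-8') as f:
        items = json.load(f)
    kept = [item for item in items if item['epoch'] >= cutoff]
    # json.dump escapes non-ASCII characters by default, so a plain
    # open() in write mode behaves the same on Python 2.7 and 3.
    with open(path, 'w') as f:
        json.dump(kept, f, indent=2)
    return len(items) - len(kept)


# Demo on a throwaway file holding the two epochs from the question.
sample = [{"headline": "first", "epoch": 1591293345.817},
          {"headline": "second", "epoch": 1591283345.817}]
fd, demo_path = tempfile.mkstemp(suffix='.json')
os.close(fd)
with open(demo_path, 'w') as f:
    json.dump(sample, f)

removed = filter_by_epoch(demo_path, CUTOFF)
with io.open(demo_path, 'r', encoding='utf-8') as f:
    remaining = json.load(f)
os.remove(demo_path)

print(removed)
print([item['headline'] for item in remaining])
```

Because the file is overwritten in place, it may be worth keeping a backup copy until the result has been checked.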

answered Jun 12, 2020 at 12:30


2 Comments

maxnormal Over a year ago

I get the following error msg: Traceback (most recent call last): File "/_scrapyard OSX/x_punger.py", line 8, in <module> if time.time()-3600 > j['epoch']: TypeError: string indices must be integers [Finished in 0.635s]

Ramtin Nouri Over a year ago

Weird, it looks as if j is a string. Does the whole list look like your example?
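
One way to narrow this down, as a rough sketch: that TypeError shows up whenever the loop variable is a string, which happens if the top level of the parsed file is a dict (iterating yields its keys) or a string (iterating yields single characters). `describe_top_level` is a hypothetical helper and the sample inputs are made up.

```python
import json


def describe_top_level(text):
    """Report what kind of value a JSON document holds at its top level."""
    data = json.loads(text)
    if isinstance(data, list):
        return 'list of %d items' % len(data)
    if isinstance(data, dict):
        return 'dict'
    if isinstance(data, type(u'')):  # unicode on Python 2, str on Python 3
        return 'string'
    return type(data).__name__


# A plain array, an array wrapped in an object, and a double-encoded array.
print(describe_top_level('[{"epoch": 1.0}]'))
print(describe_top_level('{"items": [{"epoch": 1.0}]}'))
print(describe_top_level(json.dumps('[{"epoch": 1.0}]')))
```

Running a check like this on the contents of the real file should show whether the loop is seeing a list of objects or something else.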
