I have a string that I get from git when I run this:

> git show xxxx | head -3

commit 34343asdfasdf343434asdfasdfas
Author: John Doe <[email protected]>
Date:   Wed Jun 25 09:51:49 2014 +0800

I need to convert this console output into JSON using Python (I am writing this in a script), so that the result looks like this:

{'commit': '34343asdfasd343adfas', 'Author': 'john doe', 'date': 'wed jun 25'} 

Currently I am trying to manually split by the first space in the string.

3 Answers

3

This works on your example:

>>> import json, re
>>> txt='''\
... commit 34343asdfasdf343434asdfasdfas
... Author: John Doe <[email protected]>
... Date:   Wed Jun 25 09:51:49 2014 +0800'''
>>> json.dumps({k:v for k,v in re.findall(r'^([^\s]+)\s+(.+?)$', txt, re.M)})
{"commit": "34343asdfasdf343434asdfasdfas", "Date:": "Wed Jun 25 09:51:49 2014 +0800", "Author:": "John Doe <[email protected]>"}

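If your text still has the git... part at the top, followed by a blank line, just split it off (note that without a blank line in the text, partition('\n\n')[2] is an empty string):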

>>> json.dumps({k:v for k,v in re.findall(r'^([^\s]+)\s+(.+?)$', 
                           txt.partition('\n\n')[2], re.M)})

And if you want to lose the colon, change the regex so the capturing group leaves it out:

>>> json.dumps({k:v for k,v in re.findall(r'^(\w+):?\s+(.+?)$', 
                          txt.partition('\n\n')[2], re.M)})
{"Date": "Wed Jun 25 09:51:49 2014 +0800", "commit": "34343asdfasdf343434asdfasdfas", "Author": "John Doe <[email protected]>"}

And if you want to lose the email address:

>>> json.dumps({k:v for k,v in re.findall(r'^(\w+):?\s+(.+?)(?:\s*<[^>]*>)?$', 
                     txt.partition('\n\n')[2], re.M)})
{"Date": "Wed Jun 25 09:51:49 2014 +0800", 
 "commit": "34343asdfasdf343434asdfasdfas", "Author": "John Doe"}
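For use in a script rather than at the prompt, the last pattern can be wrapped in a small function. This is only a sketch: the names HEADER_RE and header_to_json are made up for illustration, and the sample uses a placeholder email address because the real one is not shown in the question.

```python
import json
import re

# Same pattern as the last example above; HEADER_RE is a hypothetical name.
# Group 1 is the label without the colon, group 2 the value without "<email>".
HEADER_RE = re.compile(r'^(\w+):?\s+(.+?)(?:\s*<[^>]*>)?$', re.M)


def header_to_json(text):
    """Return the 'label value' lines of text as a JSON object string."""
    return json.dumps(dict(HEADER_RE.findall(text)))


# Placeholder data in the shape of the question's output (email is assumed).
sample = (
    "commit 34343asdfasdf343434asdfasdfas\n"
    "Author: John Doe <john@example.com>\n"
    "Date:   Wed Jun 25 09:51:49 2014 +0800"
)
print(header_to_json(sample))
```

Compiling the pattern once keeps the function body short and avoids repeating the flags at every call.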

2 Comments

Nice, way slicker than mine.
@dawg The OP asked for slightly customized values, not everything that follows the label on each line after a simple split.
0
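import json
import re
import subprocess

# Run the pipeline through the shell and decode its output to text
results = subprocess.Popen("git show xxxx | head -3", stdout=subprocess.PIPE, shell=True).communicate()[0].decode().strip()
data = {}
for line in results.splitlines():
    # Split on the first space: label on the left, value on the right
    key, value = line.split(" ", 1)
    # Keep only the letters of the label, which drops the trailing colon
    data[re.sub("[^a-zA-Z]", "", key)] = value.strip()
print(json.dumps(data))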
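A variation on the same idea, as a rough sketch: keep the parsing in its own function so it can be tried on a plain string, and call git without a shell, taking the first three lines in Python instead of piping through head. The function names parse_header and git_header are hypothetical, the sample email is a placeholder, and the sketch assumes the first lines of the output look like the ones in the question.

```python
import json
import re
import subprocess


def parse_header(text):
    """Map each 'label value' line to a dict entry, splitting on the first space."""
    data = {}
    for line in text.splitlines():
        key, value = line.split(" ", 1)
        # Keep only letters in the label, e.g. "Author:" becomes "Author"
        data[re.sub("[^a-zA-Z]", "", key)] = value.strip()
    return data


def git_header(rev):
    """Run 'git show <rev>' without a shell and parse its first three lines."""
    out = subprocess.run(
        ["git", "show", rev],
        stdout=subprocess.PIPE,
        check=True,                # raise CalledProcessError if git fails
        universal_newlines=True,   # decode the output to str
    ).stdout
    return parse_header("\n".join(out.splitlines()[:3]))


# Placeholder data in the shape of the question's output (email is assumed).
sample = (
    "commit 34343asdfasdf343434asdfasdfas\n"
    "Author: John Doe <john@example.com>\n"
    "Date:   Wed Jun 25 09:51:49 2014 +0800"
)
print(json.dumps(parse_header(sample)))
```

Inside a repository, json.dumps(git_header("HEAD")) would give the same kind of result for the current commit. Passing the command as a list avoids shell quoting problems if the revision comes from user input.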

1 Comment

Um yeah, not really sure why this is here!
0

Trying to follow exact format of values

The OP asks for slightly customized values, which are not simply whatever is left after removing the label. To get them, each line has to be handled separately.

Take the text input and split it into lines:

>>> text = """commit 34343asdfasdf343434asdfasdfas
... Author: John Doe <[email protected]>
... Date:   Wed Jun 25 09:51:49 2014 +0800"""
>>> lines = text.split("\n")
>>> lines
['commit 34343asdfasdf343434asdfasdfas',
 'Author: John Doe <[email protected]>',
 'Date:   Wed Jun 25 09:51:49 2014 +0800']

Assuming the lines always come in this order, process them one by one:

>>> dct = {}
>>> dct["commit"] = lines[0].split()[1]
>>> dct["Author"] = lines[1].split(": ")[1].split(" <")[0]
>>> dct["Date"] = " ".join(lines[2].split(": ")[1].split()[:3])

This gives the final dictionary:

>>> dct
{'Author': 'John Doe',
 'Date': 'Wed Jun 25',
 'commit': '34343asdfasdf343434asdfasdfas'}

which can be dumped to a string:

>>> import json
>>> json.dumps(dct)
'{"Date": "Wed Jun 25", "commit": "34343asdfasdf343434asdfasdfas", "Author": "John Doe"}'
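The same per-line handling could be wrapped in a function for a script. The sketch below looks each line up by its label instead of its position, so it does not depend on a fixed line order; the name summarize_header is made up for illustration and the sample email is a placeholder.

```python
import json


def summarize_header(text):
    """Extract the commit hash, author name, and 'weekday month day' date."""
    result = {}
    for line in text.splitlines():
        # Split off the first word as the label and drop its colon, if any
        label, _, rest = line.partition(" ")
        label = label.rstrip(":")
        rest = rest.strip()
        if label == "commit":
            result["commit"] = rest
        elif label == "Author":
            # Drop the trailing "<email>" part
            result["Author"] = rest.split(" <")[0]
        elif label == "Date":
            # Keep only the weekday, month, and day
            result["Date"] = " ".join(rest.split()[:3])
    return result


# Placeholder data in the shape of the question's output (email is assumed).
sample = (
    "commit 34343asdfasdf343434asdfasdfas\n"
    "Author: John Doe <john@example.com>\n"
    "Date:   Wed Jun 25 09:51:49 2014 +0800"
)
print(json.dumps(summarize_header(sample)))
```

Lines with other labels are ignored, so extra header lines in the input should not break the lookup.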
