0

I have a list in my Python code with the following structure:

file_info = ['{file:C:\\samples\\123.exe, directory:C:\\}','{file:C:\\samples\\345.exe, directory:C:\\}',...]

I want to extract just the file and directory values from every item in the list and print them. With the following code, I am able to extract the directory values:

for item in file_info:
    print(item.split('directory:')[1].strip('}'))

But I am not able to figure out a way to extract the 'file' values. The following doesn't work:

print(item.split('file:')[1].strip(', directory:C:\\}'))

Suggestions? If there is any better method to extract the file and directory values other than this, that would be great too. Thanks in advance.

  • Could you provide an example file_info? Commented Apr 2, 2014 at 19:16
  • What?! Does [{file:file1, directory... etc. have quotes around it? You write that you have a list of dicts and treat it like a string! Commented Apr 2, 2014 at 19:16
  • Is there any reason you're not using a list of dicts and using what seems to be a string instead? Commented Apr 2, 2014 at 19:19
  • Sorry, forgot the quotes.. it's not a list of dicts, it's just a list of strings. So it's file_info = ['{file:file1, directory:dir1}','{file:file2, directory:directory2}',...] Commented Apr 2, 2014 at 19:22
  • @user2251144 ok, list of strings. A simple example would help a lot. Thanks. Commented Apr 2, 2014 at 19:25

2 Answers

3

If the format is exactly what you've provided, you're better off using re:

import re

file_info = ['{file:file1, directory:dir1}', '{file:file2, directory:directory2}']

pattern = re.compile(r'\w+:(\w+)')
for item in file_info:
    print(pattern.findall(item))

or, using string replace(), strip() and split() (a bit hackish and fragile):

file_info = ['{file:file1, directory:dir1}', '{file:file2, directory:directory2}']

for item in file_info:
    item = item.strip('}{').replace('file:', '').replace('directory:', '')
    print(item.split(', '))

Both code snippets print:

['file1', 'dir1']
['file2', 'directory2']
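
Note that both snippets above assume values made only of word characters. For the Windows paths shown in the question, which contain `:` and `\`, a split-based variant might look like the sketch below. It also illustrates why the attempt in the question fails: `str.strip()` treats its argument as a set of characters to remove from both ends, not as a literal suffix. The helper name `parse_item` is hypothetical, and the sketch assumes every string has exactly the form `{file:<path>, directory:<path>}`.

```python
def parse_item(item):
    """Split '{file:X, directory:Y}' into (X, Y).

    Hypothetical helper; assumes the exact layout shown in the question.
    """
    body = item.strip()
    # Drop exactly one enclosing brace at each end
    if body.startswith('{') and body.endswith('}'):
        body = body[1:-1]
    # Cut at the first ', directory:' separator
    file_part, sep, dir_part = body.partition(', directory:')
    if not sep or not file_part.startswith('file:'):
        raise ValueError('unexpected format: %r' % item)
    return file_part[len('file:'):], dir_part


file_info = ['{file:C:\\samples\\123.exe, directory:C:\\}',
             '{file:C:\\samples\\345.exe, directory:C:\\}']

for item in file_info:
    print(parse_item(item))

# strip() removes any of the listed characters from both ends,
# so the attempt from the question eats into the path itself:
mangled = file_info[0].split('file:')[1].strip(', directory:C:\\}')
print(mangled)
```

Using `partition()` with the full `', directory:'` separator avoids the character-set behaviour of `strip()`, at the cost of breaking if the separator ever appears inside a path.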

If the file_info items are just dumped JSON items (watch the double quotes), you can use json to load them into dictionaries:

import json

file_info = ['{"file":"file1", "directory":"dir1"}', '{"file":"file2", "directory":"directory2"}']

for item in file_info:
    item = json.loads(item)
    print(item['file'], item['directory'])

or, with ast.literal_eval():

from ast import literal_eval

file_info = ['{"file":"file1", "directory":"dir1"}', '{"file":"file2", "directory":"directory2"}']

for item in file_info:
    item = literal_eval(item)
    print(item['file'], item['directory'])

Both code snippets print:

file1 dir1
file2 directory2
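
Both loaders only work when the strings really are valid JSON (or Python literals); the unquoted form shown in the question is neither. A minimal sketch of checking that up front, using a hypothetical helper `load_item` that returns None instead of raising:

```python
import json


def load_item(item):
    """Return a dict if `item` is valid JSON, else None (hypothetical helper)."""
    try:
        return json.loads(item)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


quoted = '{"file":"file1", "directory":"dir1"}'
unquoted = '{file:C:\\samples\\123.exe, directory:C:\\}'

print(load_item(quoted))    # parsed into a dict
print(load_item(unquoted))  # not valid JSON, so None
```

If `load_item` returns None for your data, the strings are not JSON and one of the string- or regex-based approaches is needed instead.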

Hope that helps.


Comments

Thank you for the suggestions! I think the json.loads might be more relevant to my code, so I'll try to restructure file_info. Thanks again :)
The solutions with json.loads and ast.literal_eval are incorrect, since the treatment shown by the OP in the question reveals that the strings are not as you modified them to make json.loads and ast.literal_eval work.
@eyquem probably. But, as you can see, I've asked for the example twice :) That's why I've provided several options to choose from, depending on the input format. Thanks.
@user2251144 If you can restructure file_info, you should avoid repeating 'file' and 'directory': they carry no useful information, since the file is always in first position and the directory in second position in the string. In my opinion, you're not going about solving this the right way: you're trying to fix it at the level of the consequences, while the problem, IMO, lies in how file_info is defined. I may be wrong. But withholding the full picture (origin and consequences) is characteristic of an XY problem.
Yes, you asked twice. And (s)he didn't edit the question. I don't think it's good practice to pile up solutions to compensate for the lack of a proper question. What if file_info is composed of '{"file":file1, "directory":"dir1"}'? You should provide a solution for this case too.
0

I would do:

import re

regx = re.compile(r'\{\s*file\s*:\s*([^,\s]+)\s*'
                  r','
                  r'\s*directory\s*:\s*([^}\s]+)\s*\}')

file_info = ['{file:C:\\samples\\123.exe, directory  :  C:\\}',
             '{  file:  C:\\samples\\345.exe,directory:C:\\}'
             ]

for item in file_info:
    print('%r\n%s\n' % (item,
                        regx.search(item).groups()))

Result:

'{file:C:\\samples\\123.exe, directory  :  C:\\}'
('C:\\samples\\123.exe', 'C:\\')

'{  file:  C:\\samples\\345.exe,directory:C:\\}'
('C:\\samples\\345.exe', 'C:\\')
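
One thing to watch for: `search()` returns None when a string does not match, so calling `.groups()` on the result would raise AttributeError. A possible variant, sketched with named groups and a hypothetical helper `to_dict` (neither is part of the answer above), returns a dictionary or None:

```python
import re

# Same idea as the pattern above, with named groups (names are illustrative)
ITEM_RE = re.compile(r'\{\s*file\s*:\s*(?P<file>[^,\s]+)\s*'
                     r','
                     r'\s*directory\s*:\s*(?P<directory>[^}\s]+)\s*\}')


def to_dict(item):
    """Return {'file': ..., 'directory': ...}, or None if item does not match."""
    match = ITEM_RE.search(item)
    return match.groupdict() if match else None


print(to_dict('{file:C:\\samples\\123.exe, directory  :  C:\\}'))
print(to_dict('not a record'))
```

Like the original pattern, this assumes the values contain no whitespace, commas or closing braces.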
