
This is similar to splitting a list of strings into a list of lists of strings, but I also want a copy of the original string as the first element of the list derived from it. The purpose is to parse elements out of a filename while retaining the filename itself, so that once I match a list by its words, the filename is readily available to work with.

For example,

stringList = ["wordA1_wordA2_wordA3","wordB1_wordB2_wordB3"]

becomes

splitList = [["wordA1_wordA2_wordA3","wordA1","wordA2","wordA3"],
             ["wordB1_wordB2_wordB3","wordB1","wordB2","wordB3"]]

I'm trying to do it in a single command, as a list comprehension.

The closest I've gotten is:

splitList = [[item,item.split('_')] for item in stringList]

which yields:

splitList = [["wordA1_wordA2_wordA3",["wordA1","wordA2","wordA3"]],
             ["wordB1_wordB2_wordB3",["wordB1","wordB2","wordB3"]]]

I could work with this, but is there a more elegant suggestion that I could learn from?

I've tried

splitList = [item.split('_') + item for item in stringList]

which raises a TypeError, because a str cannot be concatenated to a list.

And

splitList = [item.split('_').append(item) for item in stringList]

which creates a list of None values.

  • What output are you expecting? Commented May 21, 2019 at 17:00

2 Answers


You can unpack the split list with *:

splitList=[[item,*item.split('_')] for item in stringList]

which gives you the desired result:

splitList = [["wordA1_wordA2_wordA3","wordA1","wordA2","wordA3"],
             ["wordB1_wordB2_wordB3","wordB1","wordB2","wordB3"]]

You can also do something like:

splitList=[[item] + item.split('_') for item in stringList]

to handle concatenating a string with a list. [item] creates a single-element list containing item, which is then concatenated with the split list.
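As a quick check, the two forms can be run side by side. This is only a minimal sketch using the sample data from the question; the names unpacked and concatenated are made up for illustration. Note that the * form inside a list display needs Python 3.5 or later.

```python
stringList = ["wordA1_wordA2_wordA3", "wordB1_wordB2_wordB3"]

# Unpacking form: * spreads the split parts into the new list (Python 3.5+).
unpacked = [[item, *item.split('_')] for item in stringList]

# Concatenation form: wrap the string in a list, then add the split parts.
concatenated = [[item] + item.split('_') for item in stringList]

# Both build the same list of lists, with the original string first.
print(unpacked == concatenated)  # True
print(unpacked[0])  # ['wordA1_wordA2_wordA3', 'wordA1', 'wordA2', 'wordA3']
```

The concatenation form also works on older Python versions, which may matter if the script has to run there.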


The reason [item.split('_').append(item) ...] yields a list of None values is that list.append modifies the list in place and returns None.
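A small sketch of that behavior, outside the comprehension (the variable names parts and result are illustrative only):

```python
filename = "wordA1_wordA2_wordA3"

parts = filename.split('_')
result = parts.append(filename)  # mutates parts; the call itself returns None

print(result)  # None
print(parts)   # ['wordA1', 'wordA2', 'wordA3', 'wordA1_wordA2_wordA3']
```

Even if the return value were usable, append would place the filename at the end of the list rather than at the front, which is not the layout asked for in the question.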

A dict may be a better fit here than a list of lists, since the filename can be the key and its individual components the value:

stringList = ["wordA1_wordA2_wordA3","wordB1_wordB2_wordB3"]

string_dict = {filename: filename.split("_") for filename in stringList}

# {'wordA1_wordA2_wordA3': ['wordA1', 'wordA2', 'wordA3'], 'wordB1_wordB2_wordB3': ['wordB1', 'wordB2', 'wordB3']}

However, if you need a list:

processed_list = [[filename, *filename.split("_")] for filename in stringList]

# [['wordA1_wordA2_wordA3', 'wordA1', 'wordA2', 'wordA3'], ['wordB1_wordB2_wordB3', 'wordB1', 'wordB2', 'wordB3']]

Here, [filename, *filename.split("_")] uses * to unpack the list returned by str.split into the enclosing list.
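To tie the dict version back to the original goal of matching on a word and then using the filename, a lookup might look like the sketch below. The search word target and the name matches are assumptions made up for this example, not part of the question.

```python
stringList = ["wordA1_wordA2_wordA3", "wordB1_wordB2_wordB3"]
string_dict = {filename: filename.split("_") for filename in stringList}

target = "wordB2"  # hypothetical word to search for

# Keep every filename whose split components contain the target word.
matches = [filename for filename, parts in string_dict.items() if target in parts]

print(matches)  # ['wordB1_wordB2_wordB3']
```

One caveat of the dict approach: duplicate filenames in stringList collapse into a single key, so it only suits input where each filename is unique.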
