How to create a list by splitting the elements of a list in Python?

Question

Let's say I have:

sentences = ['The girls are gorgeous', 'I'm mexican']

And I want to obtain:

words = ['The','girls','are','gorgeous', 'I'm', 'mexican']

I tried:

words = [w.split(' ') for w in sentences]

but did not get the expected result.

Will the result work with Counter(words)? I need to obtain the word frequencies.

'I'm mexican' is invalid syntax. Use "I'm mexican" instead.
– Ammar, Commented Apr 9, 2014 at 6:19

Syed Habib M · Accepted Answer · 2014-04-09 06:17:44Z

6

Try it like this:

sentences = ["The girls are gorgeous", "I'm mexican"]
words = [word for sentence in sentences for word in sentence.split(' ')]
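One detail worth checking, shown here as a small sketch: split(' ') splits on every single space, so repeated spaces produce empty strings, while split() with no argument splits on runs of whitespace. The input with a double space is made up for illustration and is not from the question.

```python
# Hypothetical input with a double space, not taken from the question.
sentences = ["The girls  are gorgeous", "I'm mexican"]

# split(' ') treats each space as a separator, so the double space yields ''.
with_space = [word for sentence in sentences for word in sentence.split(' ')]

# split() with no argument splits on runs of whitespace and drops empty strings.
no_arg = [word for sentence in sentences for word in sentence.split()]

print(with_space)  # ['The', 'girls', '', 'are', 'gorgeous', "I'm", 'mexican']
print(no_arg)      # ['The', 'girls', 'are', 'gorgeous', "I'm", 'mexican']
```

If the words are going to be counted, the empty string would show up as its own entry, so split() is probably the safer choice for messy input.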

answered Apr 9, 2014 at 6:17


thefourtheye · Accepted Answer · 2014-04-09 06:37:03Z

4

Your method didn't work because split returns a list, so your code creates a nested list (a list of lists). You need to flatten it before passing it to Counter. There are many ways to flatten it.

from itertools import chain
from collections import Counter
Counter(chain.from_iterable(words))

would be the best way to flatten the nested list and find the frequencies. But you can also use a generator expression, like this:

from collections import Counter

sentences = ['The girls are gorgeous', "I'm mexican"]
print(Counter(item for items in sentences for item in items.split()))
# On Python 3.7+: Counter({'The': 1, 'girls': 1, 'are': 1, 'gorgeous': 1, "I'm": 1, 'mexican': 1})

This takes each sentence, splits it into a list of words, and iterates over those words, which flattens the nested structure.
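To make the flattening concrete, here is a minimal sketch (assuming Python 3) that shows the nested list the question's attempt produces, why Counter rejects it, and the two flattening approaches side by side. The variable names are only for illustration.

```python
from collections import Counter
from itertools import chain

sentences = ["The girls are gorgeous", "I'm mexican"]

# The question's attempt: one inner list per sentence.
nested = [s.split() for s in sentences]
print(nested)  # [['The', 'girls', 'are', 'gorgeous'], ["I'm", 'mexican']]

# Counter needs hashable items; lists are not hashable, so this fails.
try:
    Counter(nested)
except TypeError as exc:
    print("Counter(nested) failed:", exc)

# Option 1: flatten the nested list with chain.from_iterable.
counts_chain = Counter(chain.from_iterable(nested))

# Option 2: split and flatten in a single generator expression.
counts_gen = Counter(word for s in sentences for word in s.split())

print(counts_chain == counts_gen)  # True
print(counts_gen["girls"])         # 1
```

Both options give the same counts; the generator expression simply avoids building the intermediate nested list.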

If you want to find the 10 most common words, you can use the Counter.most_common method, like this:

Counter(item for items in sentences for item in items.split()).most_common(10)
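Since every word in the sample occurs exactly once, here is a short sketch of most_common on a made-up input with repeated words. The sentences are hypothetical and only there to make the ranking visible.

```python
from collections import Counter

# Hypothetical sentences with repeated words, for illustration only.
sentences = ["the cat saw the dog", "the dog ran"]

counts = Counter(word for sentence in sentences for word in sentence.split())

# most_common(n) returns up to n (word, count) pairs, highest count first.
top_two = counts.most_common(2)
print(top_two)  # [('the', 3), ('dog', 2)]

# Asking for more entries than exist just returns all of them.
print(len(counts.most_common(10)))  # 5
```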

edited Apr 9, 2014 at 6:37

answered Apr 9, 2014 at 6:18


5 Comments

diegoaguilar Over a year ago

What if I want to get the 10 most repeated right away?

thefourtheye Over a year ago

@diegoaguilar You can use the most_common method, like this: print(Counter(item for items in sentences for item in items.split()).most_common(10))

diegoaguilar Over a year ago

Nice @thefourtheye You should edit your question so it helps more people in future .. Thanks again deeply

thefourtheye Over a year ago

@diegoaguilar You meant the answer, right? ;) I updated it :)

diegoaguilar Over a year ago

God yeah, answer. I'm soo tired now and get confused

ooga · Accepted Answer · 2014-04-09 06:30:01Z

2

Try this:

words = ' '.join(sentences).split()
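As a quick check, sketched on the question's sample data, the join-then-split approach gives the same flat list as a nested comprehension, and the result can be passed straight to Counter.

```python
from collections import Counter

sentences = ["The girls are gorgeous", "I'm mexican"]

# Join all sentences into one string, then split on whitespace.
words = ' '.join(sentences).split()
print(words)  # ['The', 'girls', 'are', 'gorgeous', "I'm", 'mexican']

# Same result as flattening with a comprehension.
flat = [word for sentence in sentences for word in sentence.split()]
print(words == flat)  # True

# The flat list works with Counter.
print(Counter(words)["mexican"])  # 1
```

Note that this builds one intermediate string holding all the sentences, which is fine for small inputs but may matter for very large ones.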

edited Apr 9, 2014 at 6:30 by ooga

answered Apr 9, 2014 at 6:19 by Nishant Nawarkhede
