1

I have this list of lists in Python:

[[100,XHS,0],
[100,34B,3],
[100,42F,1],
[101,XHS,2],
[101,34B,5],
[101,42F,2],
[102,XHS,1],
[102,34B,2],
[102,42F,0],
[103,XHS,0],
[103,34B,4],
[103,42F,2]]

and I would like to find the most efficient way (I'm dealing with a lot of data) to create a new list of lists that collects the last element of each sublist, grouped by id (the first element). So for the sample list above, my result would be:

[[0,3,1],
[2,5,2],
[1,2,0],
[0,4,2]]

How can I implement this in Python? Thanks

7
  • 2
    How do you know the size of each sublist should be 3? Commented Aug 2, 2013 at 13:25
  • 3
    FYI that list isn't valid Python - the item in the middle should be quoted. Commented Aug 2, 2013 at 13:26
  • Each sublist contains only 3 elements: the ID, a code, and the occurrence of that code for each ID. I want to take the count of each code for each ID and create n count vectors, where n is the number of unique IDs (e.g. 100, 101, etc.). Commented Aug 2, 2013 at 13:28
  • 1
    The second item of each sublist has to be a string as it contains alphanumeric, otherwise python throws an error. Commented Aug 2, 2013 at 13:31
  • @thegrinner It could be valid Python. How do you know XHS isn't a name? Commented Aug 2, 2013 at 13:39

5 Answers

8

An itertools approach with the building blocks broken out - get last elements, group into threes, convert groups of 3 into a list...

from operator import itemgetter
from itertools import imap, izip

last_element = imap(itemgetter(-1), a)
in_threes = izip(*[iter(last_element)] * 3)
res = map(list, in_threes)
# [[0, 3, 1], [2, 5, 2], [1, 2, 0], [0, 4, 2]]
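
On Python 3, where imap and izip no longer exist, the same building blocks might look like the sketch below. It assumes `a` holds the question's list with the codes quoted as strings, and, like the version above, it relies on every id having exactly three rows.

```python
from operator import itemgetter

# Sample data from the question, with the codes quoted (an assumption).
a = [
    [100, 'XHS', 0], [100, '34B', 3], [100, '42F', 1],
    [101, 'XHS', 2], [101, '34B', 5], [101, '42F', 2],
    [102, 'XHS', 1], [102, '34B', 2], [102, '42F', 0],
    [103, 'XHS', 0], [103, '34B', 4], [103, '42F', 2],
]

# The built-in map and zip are lazy in Python 3, like imap and izip were.
last_element = map(itemgetter(-1), a)
# Three references to one iterator, so zip pulls three consecutive items per tuple.
in_threes = zip(*[iter(last_element)] * 3)
res = [list(group) for group in in_threes]
print(res)
```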

However, it looks like you want to "group" on the first element (instead of purely blocks of 3 consecutive items), so you can use defaultdict for this:

from collections import defaultdict
dd = defaultdict(list)
for el in a:
    dd[el[0]].append(el[-1])

# defaultdict(<type 'list'>, {100: [0, 3, 1], 101: [2, 5, 2], 102: [1, 2, 0], 103: [0, 4, 2]})
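
To get the list of lists the question asks for, the grouped values still have to be pulled out of the dict. A minimal sketch, assuming the ids should come out in sorted order; the helper name `group_last_by_id` is made up for illustration:

```python
from collections import defaultdict


def group_last_by_id(rows):
    """Collect the last element of each row under its first element (the id)."""
    dd = defaultdict(list)
    for el in rows:
        dd[el[0]].append(el[-1])
    # One list per id, in sorted id order (an assumption about the desired order).
    return [dd[key] for key in sorted(dd)]


# Rows deliberately interleaved to show that the input order of ids does not matter.
rows = [[101, 'XHS', 2], [100, 'XHS', 0], [100, '34B', 3], [101, '34B', 5]]
print(group_last_by_id(rows))
```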

2 Comments

I totally missed the groups.
@MartijnPieters the OP snuck it in as a comment... (well just re-inforced the somewhat subtle way of describing it in the question)
2

You are trying to do two things here:

  • Get the last element of each nested list.
  • Group those elements by the first element of each nested list.

You can use a list comprehension to get the last element of each nested list:

last_elems = [sublist[-1] for sublist in outerlist]

If the whole list is sorted by the first element (the id) then you can use itertools.groupby to do the second part:

from itertools import groupby
from operator import itemgetter

[[g[-1] for g in group] for id_, group in groupby(outerlist, key=itemgetter(0))]

Demo:

>>> outerlist = [
...     [100,'XHS',0],
...     [100,'34B',3],
...     [100,'42F',1],
...     [101,'XHS',2],
...     [101,'34B',5],
...     [101,'42F',2],
...     [102,'XHS',1],
...     [102,'34B',2],
...     [102,'42F',0],
...     [103,'XHS',0],
...     [103,'34B',4],
...     [103,'42F',2]
... ]
>>> from itertools import groupby
>>> from operator import itemgetter
>>> [[g[-1] for g in group] for id_, group in groupby(outerlist, key=itemgetter(0))]
[[0, 3, 1], [2, 5, 2], [1, 2, 0], [0, 4, 2]]
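
The sortedness requirement matters because groupby only merges consecutive rows with equal keys. A small sketch with made-up, deliberately unsorted rows shows what happens otherwise:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows where id 100 appears in two separate runs.
unsorted_rows = [[100, 'XHS', 0], [101, 'XHS', 2], [100, '34B', 3]]

split_groups = [
    [g[-1] for g in group]
    for id_, group in groupby(unsorted_rows, key=itemgetter(0))
]
print(split_groups)  # id 100 ends up in two groups instead of one
```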

If it wasn't sorted, you'd either have to sort it first (using outerlist.sort(key=itemgetter(0))), or, if you don't need a sorted version anywhere else, use a collections.defaultdict approach to grouping:

from collections import defaultdict

grouped = defaultdict(list)
for sublist in outerlist:
    grouped[sublist[0]].append(sublist[-1])

output = grouped.values()
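
For comparison, the sort-first route mentioned above could look like this sketch. It uses sorted() so the input is not mutated (an assumption; outerlist.sort works too), and the sample rows are made up. Python's sort is stable, so rows within one id keep their original order.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input.
outerlist = [[101, 'XHS', 2], [100, 'XHS', 0], [101, '34B', 5], [100, '34B', 3]]

by_id = sorted(outerlist, key=itemgetter(0))  # stable: keeps per-id row order
output = [
    [g[-1] for g in group]
    for id_, group in groupby(by_id, key=itemgetter(0))
]
print(output)
```

Sorting costs O(n log n), whereas the defaultdict loop is a single O(n) pass, which may matter for large inputs.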

3 Comments

I like this answer, but is this the most efficient way to do this? It is definitely a very concise way to do it.
@MohammadS.: it is more efficient than using zip(*outerlist)[0] in that it doesn't build new tuples for the discarded columns.
It's amazing how many people want to re-answer the how do you split a list question.
2
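new_list = []
temp_list = []
counter = 1

# `data` is the input list of [id, code, count] sublists
for x in data:
  temp_list.append(x[-1])  # append, not extend: x[-1] is a single int
  if counter % 3 == 0:
    # every third item closes a group
    new_list.append(temp_list)
    temp_list = []
  counter += 1
print(new_list)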

5 Comments

This doesn't get his answer. Look at his required output
There we go. That's better. |=^)
…This is odd. I downvoted this before your edit, but SO won't let me remove the downvote. It thinks I downvoted it after your answer. To meta!
I had to do a bogus edit in order to remove my downvote.
@kojiro: edits during the 5 minute grace period do not reset vote locks.
1

If you don't know how many items there are for each key, and the items for each key are consecutive in the original list, you can use groupby:

>>> from itertools import groupby
>>> from operator import itemgetter
>>> [map(itemgetter(-1),it) for key,it in groupby(L,itemgetter(0))]
[[0, 3, 1], [2, 5, 2], [1, 2, 0], [0, 4, 2]]

Explanation

Each it is an iterator over items with the same key:

>>> [list(it) for key,it in groupby(L,itemgetter(0))]
[[[100, 'XHS', 0], [100, '34B', 3], [100, '42F', 1]], [[101, 'XHS', 2], [101, '34B', 5], [101, '42F', 2]], [[102, 'XHS', 1], [102, '34B', 2], [102, '42F', 0]], [[103, 'XHS', 0], [103, '34B', 4], [103, '42F', 2]]]

map just takes the last element from each sublist:

>>> [map(itemgetter(-1),it) for key,it in groupby(L,itemgetter(0))]
[[0, 3, 1], [2, 5, 2], [1, 2, 0], [0, 4, 2]]
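
Note that the map-based expression depends on Python 2, where map returns a list. On Python 3, map is lazy, and a group's iterator is no longer usable once groupby advances to the next key, so each group would need to be materialized right away. A sketch, with L assumed to be the question's list truncated to two ids:

```python
from itertools import groupby
from operator import itemgetter

L = [
    [100, 'XHS', 0], [100, '34B', 3], [100, '42F', 1],
    [101, 'XHS', 2], [101, '34B', 5], [101, '42F', 2],
]

# list(...) consumes each group before groupby moves on to the next key.
result = [list(map(itemgetter(-1), it)) for key, it in groupby(L, itemgetter(0))]
print(result)
```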


0
l=[[100,'XHS',0],
[100,'34B',3],
[100,'42F',1],
[100,'XHS',0],
[100,'34B',30],
[100,'42F',10],
[100,'XHS',0],
[100,'34B',300],
[100,'42F',100]]

def chunks(l, n):
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

will print:

[[0, 3, 1], [0, 30, 10], [0, 300, 100]]
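
The snippet defines chunks but does not show the call that produces this output. One way it could be used, written for Python 3 (range instead of xrange) and assuming a chunk size of 3; the list-comprehension call is a guess, not the answerer's code:

```python
def chunks(l, n):
    """Yield successive n-sized slices of l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]


l = [[100, 'XHS', 0], [100, '34B', 3], [100, '42F', 1],
     [100, 'XHS', 0], [100, '34B', 30], [100, '42F', 10],
     [100, 'XHS', 0], [100, '34B', 300], [100, '42F', 100]]

# Take the last element of every row within each chunk of 3 rows.
result = [[row[-1] for row in chunk] for chunk in chunks(l, 3)]
print(result)
```

Like the counter-based answer, this splits purely by position, so it only matches a grouping by id when every id has exactly n rows.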

