31

I have two iterators, a list and an itertools.count object (i.e. an infinite value generator). I would like to merge these two into a resulting iterator that will alternate yield values between the two:

>>> import itertools
>>> c = itertools.count(1)
>>> items = ['foo', 'bar']
>>> merged = imerge(items, c)  # the mythical "imerge"
>>> merged.next()
'foo'
>>> merged.next()
1
>>> merged.next()
'bar'
>>> merged.next()
2
>>> merged.next()
Traceback (most recent call last):
    ...
StopIteration

What is the simplest, most concise way to do this?

2
  • 1
    Don't use this one folks: list((yield next(c)) or i for i in items) Commented Jul 12, 2017 at 12:21
  • 3
    This isn't what OP is looking for, but it's the first result upon googling "merge iterators python," so I figured I would comment: If you're looking for a mergesort-type function that merges two sorted iterators into one longer sorted iterator, use heapq.merge. Commented Dec 9, 2019 at 2:01

13 Answers 13

46

A generator will solve your problem nicely.

def imerge(a, b):
    for i, j in itertools.izip(a,b):
        yield i
        yield j
Sign up to request clarification or add additional context in comments.

10 Comments

You should add a disclaimer - this will only work if list a is finite.
Claudiu is correct. Try zipping two infinite generators--you will run out of memory eventually. I would prefer using itertools.izip instead of zip. Then you build the zip as you go, instead of all at once. You still have to watch out for infinite loops, but hey.
It will still only work if one of the arguments is a finite iterable. If they are both infinite, zip() won't work. Use itertools.izip() instead.
In Python 3.0 zip() behaves like itertools.izip().
Can somebody clarify for the noobs like me that we will be able to handle reading a finite number of elements out of two infinite generators if we use izip? e.g. This is the primary reason for izip to exist, yes?
|
16

You can do something that is almost exaclty what @Pramod first suggested.

def izipmerge(a, b):
  for i, j in itertools.izip(a,b):
    yield i
    yield j

The advantage of this approach is that you won't run out of memory if both a and b are infinite.

1 Comment

Quite correct, David. @Pramod changed his answer to use izip before I noticed yours, but thanks!
15

I also agree that itertools is not needed.

But why stop at 2?

  def tmerge(*iterators):
    for values in zip(*iterators):
      for value in values:
        yield value

handles any number of iterators from 0 on upwards.

UPDATE: DOH! A commenter pointed out that this won't work unless all the iterators are the same length.

The correct code is:

def tmerge(*iterators):
  empty = {}
  for values in itertools.zip_longest(*iterators, fillvalue=empty):
    for value in values:
      if value is not empty:
        yield value

and yes, I just tried it with lists of unequal length, and a list containing {}.

5 Comments

Does this exhaust each iterator ? I think zip will truncate to the shortest one. I'm looking for a merge that takes one from each iterator in turn, until each of them is exhausted.
How embarrassing. You are perfectly correct! See my improved code here.
No embarassment needed, your reply and quick response saved me hours of pain!
For python3 replace izip -> zip.
@Stefan: fixed, thanks. I wrote this two days after Python 3 was released! Now Python 2 is defunct.
12

I'd do something like this. This will be most time and space efficient, since you won't have the overhead of zipping objects together. This will also work if both a and b are infinite.

def imerge(a, b):
    i1 = iter(a)
    i2 = iter(b)
    while True:
        try:
            yield i1.next()
            yield i2.next()
        except StopIteration:
            return

2 Comments

The try/except here breaks the iterator protocol by muffling the StopIteration, doesn't it?
@David Eyk: it's OK, because returning from a generator raises StopIteration anyway. The try statement in this case is superfluous.
11

You can use zip as well as itertools.chain. This will only work if the first list is finite:

merge=itertools.chain(*[iter(i) for i in zip(['foo', 'bar'], itertools.count(1))])

2 Comments

Why do you have a restriction on the size of the first list?
It doesn't need to be so complicated, though: merged = chain.from_iterable(izip(items, count(1))) will do it.
6

I prefer this other way which is much more concise:

iter = reduce(lambda x,y: itertools.chain(x,y), iters)

1 Comment

add from functools import reduce in python 3 before running the line above
4

One of the less well known features of Python is that you can have more for clauses in a generator expression. Very useful for flattening nested lists, like those you get from zip()/izip().

def imerge(*iterators):
    return (value for row in itertools.izip(*iterators) for value in row)

3 Comments

Definitely would work, though I find nested generator expressions less than readable. I'd use this style if I were worried about performance.
It is really concise, as Python often is, but how does one begin to see what this code does? What is the effect of value for row in ... followed by for value in row? Isn't this a nested list-comprehension-generator? shouldn't it end with something like for rowvalue in row or is value shadowed?
@StevenLu Basically it's two nested loops, like this: for row in itertools.izip(*iterators): for value in row: yield value
3

I'm not sure what your application is, but you might find the enumerate() function more useful.

>>> items = ['foo', 'bar', 'baz']
>>> for i, item in enumerate(items):
...  print item
...  print i
... 
foo
0
bar
1
baz
2

1 Comment

I always forget about enumerate! What a useful little tool, though it won't work in my particular application. Thanks!
3

Here is an elegant solution:

def alternate(*iterators):
    while len(iterators) > 0:
        try:
            yield next(iterators[0])
            # Move this iterator to the back of the queue
            iterators = iterators[1:] + iterators[:1]
        except StopIteration:
            # Remove this iterator from the queue completely
            iterators = iterators[1:]

Using an actual queue for better performance (as suggested by David):

from collections import deque

def alternate(*iterators):
    queue = deque(iterators)
    while len(queue) > 0:
        iterator = queue.popleft()
        try:
            yield next(iterator)
            queue.append(iterator)
        except StopIteration:
            pass

It works even when some iterators are finite and others are infinite:

from itertools import count

for n in alternate(count(), iter(range(3)), count(100)):
    input(n)

Prints:

0
0
100
1
1
101
2
2
102
3
103
4
104
5
105
6
106

It also correctly stops if/when all iterators have been exhausted.

If you want to handle non-iterator iterables, like lists, you can use

def alternate(*iterables):
    queue = deque(map(iter, iterables))
    ...

1 Comment

An interesting approach. :) So many ways to do this. I wonder if a rotating deque() would be more efficient than rebuilding the tuple on every iteration?
1

Use izip and chain together:

>>> list(itertools.chain.from_iterable(itertools.izip(items, c))) # 2.6 only
['foo', 1, 'bar', 2]

>>> list(itertools.chain(*itertools.izip(items, c)))
['foo', 1, 'bar', 2]

Comments

1

A concise method is to use a generator expression with itertools.cycle(). It avoids creating a long chain() of tuples.

generator = (it.next() for it in itertools.cycle([i1, i2]))

Comments

0

Why is itertools needed?

def imerge(a,b):
    for i,j in zip(a,b):
        yield i
        yield j

In this case at least one of a or b must be of finite length, cause zip will return a list, not an iterator. If you need an iterator as output then you can go for the Claudiu solution.

1 Comment

I prefer an iterator, as I'm reading values from files of arbitrary size. I'm sure there are cases where zip is superior.
0

Using itertools.izip(), instead of zip() as in some of the other answers, will improve performance:

As "pydoc itertools.izip" shows:

Works like the zip() function but consumes less memory by returning an iterator instead of a list.

Itertools.izip will also work properly even if one of the iterators is infinite.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.