I'm currently implementing a beat detection algorithm with Python and NumPy/SciPy. I need to read a .wav file and process it. Here is the code:

sampling_rate, wave_data = scipy.io.wavfile.read(argv[1])

wave_data is a 1-D numpy array with about 441,000 elements (10 seconds of sound at a 44.1 kHz sampling rate). Now I need to do some basic math on each pair of adjacent elements in this array. This is how I do it now:

wave_data = [sampling_rate * (wave_data[i+1] - wave_data[i]) 
             for i in xrange(len(wave_data)-1)]

This operation takes too much time (noticeable without profiling). I need to map the array pairwise "in-place", without creating a new Python list. I know there is numpy.vectorize, but I don't know how to do the mapping pairwise (over each pair of adjacent elements).

1 Answer

Either of the following will do it:

wave_data = sampling_rate * np.diff(wave_data)

or

wave_data = sampling_rate * (wave_data[1:] - wave_data[:-1])

For example:

In [7]: sampling_rate = 2

In [8]: wave_data = np.array([1, 3, 5, 2, 8, 10])

In [9]: sampling_rate * (wave_data[1:] - wave_data[:-1])
Out[9]: array([ 4,  4, -6, 12,  4])
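One thing to watch with real audio: for a 16-bit PCM file, scipy.io.wavfile.read returns an int16 array, and differencing or scaling in that dtype can overflow. A minimal sketch of one way around it, assuming it is acceptable to work in float64 (the helper name scaled_diff is made up for this illustration and is not part of NumPy or SciPy):

```python
import numpy as np

def scaled_diff(samples, sampling_rate):
    """Return sampling_rate * (samples[i+1] - samples[i]) as float64.

    Hypothetical helper: converting to float64 first avoids integer
    overflow when the input holds 16-bit PCM samples.
    """
    samples = np.asarray(samples, dtype=np.float64)
    return sampling_rate * np.diff(samples)

# Synthetic int16 data standing in for the output of wavfile.read.
pcm = np.array([32767, -32768, 0, 100], dtype=np.int16)
result = scaled_diff(pcm, 44100)
print(result)
```

The conversion costs one extra copy of the signal, which is usually a reasonable trade for getting correct values.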

As far as performance is concerned, both of these approaches are roughly 450 times faster than the list comprehension in the timings below:

In [16]: wave_data = np.array([1., 3, 5, 2, 8, 10, 5, 2, 4, 7] * 44100)

In [17]: %timeit sampling_rate * np.diff(wave_data)
100 loops, best of 3: 2.2 ms per loop

In [18]: %timeit sampling_rate * (wave_data[1:] - wave_data[:-1])
100 loops, best of 3: 2.15 ms per loop

In [19]: %timeit [sampling_rate * (wave_data[i+1] - wave_data[i]) for i in xrange(len(wave_data)-1)]
1 loops, best of 3: 970 ms per loop
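
To repeat this comparison outside IPython, something along these lines should work. It is only a sketch: it uses the standard timeit module, a shorter array than above so the loop finishes quickly, and range instead of xrange so it runs on Python 3. The function names are made up for the illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

sampling_rate = 2
# 44,100 samples (one tenth of the array used above) to keep the loop quick.
wave_data = np.array([1., 3, 5, 2, 8, 10, 5, 2, 4, 7] * 4410)

def with_diff():
    return sampling_rate * np.diff(wave_data)

def with_slices():
    return sampling_rate * (wave_data[1:] - wave_data[:-1])

def with_loop():
    return [sampling_rate * (wave_data[i + 1] - wave_data[i])
            for i in range(len(wave_data) - 1)]

# Best of 3 repeats, 3 calls per repeat, reported as seconds per call.
timings = {}
for func in (with_diff, with_slices, with_loop):
    best = min(timeit.repeat(func, number=3, repeat=3)) / 3
    timings[func.__name__] = best
    print("%-12s %.6f s per call" % (func.__name__, best))
```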

2 Comments

I think you should mention that wave_data[1:] and wave_data[:-1] are views of the existing array, so they consume almost no extra memory, even though their difference does create a new array.
Work is turning me into a memory management freak, so I probably would have gone with wave_data[1:] -= wave_data[:-1]; wave_data *= sampling_rate; wave_data = wave_data[1:], which may also be marginally faster because no separate result array has to be allocated.
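
A small sketch of that in-place variant, under two assumptions: the array is floating point, and NumPy is version 1.13 or later, where in-place operations on overlapping slices give the same result as if there were no overlap. After the subtraction the differences sit in elements 1 onward, so the final slice here is wave_data[1:]; element 0 still holds the untouched first sample.

```python
import numpy as np

sampling_rate = 2
wave_data = np.array([1., 3, 5, 2, 8, 10])

# Reference result from the allocating version, for comparison.
expected = sampling_rate * np.diff(wave_data)

# In-place version: overwrite elements 1..n-1 with the differences.
wave_data[1:] -= wave_data[:-1]
wave_data *= sampling_rate
# Drop element 0, which is the scaled first sample, not a difference.
wave_data = wave_data[1:]

print(wave_data)
```

Note that this destroys the original samples, and NumPy may still use a temporary buffer internally to handle the overlap, so any saving in memory or time is worth measuring rather than assuming.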