
I am using keras and have:

        corrupted_samples, corrupted_sample_rate = sf.read(
            self.corrupted_audio_file_paths[index])

        frequencies, times, spectrogram = scipy.signal.spectrogram(
            corrupted_samples, corrupted_sample_rate)

As per the docs, this gives:

f (ndarray) - Array of sample frequencies.
t (ndarray) - Array of segment times.
Sxx (ndarray) - Spectrogram of x. By default, the last axis of Sxx corresponds to the segment times.

I assume all of the times will line up, so I don't care about the value of the time (I don't think). The same is true of frequencies. So what I actually need is the values at each time for each frequency, which is given by Sxx (or spectrogram) in my code. I'm unsure how to actually do that. It seems simple though.
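For concreteness, this is the kind of reshaping I imagine, though I haven't verified it (the Keras-facing part is just illustrative):

import numpy as np

# Sxx from scipy.signal.spectrogram has shape (n_frequencies, n_times) by
# default (the last axis is segment times), so transposing should give the
# (n_timesteps, n_frequencies) layout a Keras sequence model expects.
model_input = spectrogram.T                      # (n_timesteps, n_frequencies)
batch = np.expand_dims(model_input, axis=0)      # (1, n_timesteps, n_frequencies)
print(batch.shape)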


1 Answer


Based on https://towardsdatascience.com/speech-recognition-analysis-f03ff9ce78e9, the author states that a spectrogram is a spectro-temporal representation of the sound and shows some of the steps for converting a WAV file to a spectrogram.

One possible example is shown below:

## Check the sampling rate of the WAV file.
audio_file = './siren_mfcc_demo.wav'

import wave
with wave.open(audio_file, "rb") as wave_file:
    sr = wave_file.getframerate()
print(sr)

# The snippet below relies on tf.contrib, so it needs TensorFlow 1.x; eager
# execution is enabled so the .numpy() calls work.
import tensorflow as tf
tf.enable_eager_execution()

audio_binary = tf.read_file(audio_file)

# tf.contrib.ffmpeg is not supported on Windows, see
# https://github.com/tensorflow/tensorflow/issues/8271
waveform = tf.contrib.ffmpeg.decode_audio(
    audio_binary, file_format='wav', samples_per_second=sr, channel_count=1)
print(waveform.numpy().shape)

signals = tf.reshape(waveform, [1, -1])
signals.get_shape()

# Compute a [batch_size, ?, 128] tensor of fixed-length, overlapping windows,
# where each window overlaps the previous by 75% (frame_length - frame_step
# samples of overlap).
frames = tf.contrib.signal.frame(signals, frame_length=128, frame_step=32)
print(frames.numpy().shape)

# `magnitude_spectrograms` is a [batch_size, ?, 129] tensor of spectrograms. We
# would like to produce overlapping fixed-size spectrogram patches; for example,
# for use in a situation where a fixed-size input is needed.
magnitude_spectrograms = tf.abs(tf.contrib.signal.stft(
    signals, frame_length=256, frame_step=64, fft_length=256))
print(magnitude_spectrograms.numpy().shape)

The method above refers to https://colab.research.google.com/drive/1Adcy25HYC4c9uSBDK9q5_glR246m-TSx#scrollTo=QTa1BVSOw1Oe

Hope it helps.
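Note that tf.contrib was removed in TensorFlow 2.x, so the snippet above only runs on TensorFlow 1.x. On a newer version, roughly the same magnitude spectrogram can be computed with tf.signal.stft; the following is an untested sketch along those lines (it assumes a 16-bit PCM WAV file, since tf.audio.decode_wav only handles that format):

import tensorflow as tf

# Sketch of a TF 2.x equivalent of the tf.contrib code above.
audio_binary = tf.io.read_file(audio_file)                 # same audio_file as above
waveform, sr = tf.audio.decode_wav(audio_binary, desired_channels=1)
signals = tf.reshape(waveform, [1, -1])                    # (batch, samples)

magnitude_spectrograms = tf.abs(tf.signal.stft(
    signals, frame_length=256, frame_step=64, fft_length=256))
print(magnitude_spectrograms.shape)                        # (1, n_frames, 129)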


3 Comments

Thank you. I already have the spectrogram from scipy.signal.spectrogram. I need to convert that to a tensor of (n_timesteps, n_frequencies) somehow.
Did you ever find a solution for that, @Shamoon?
I am trying to find a solution like that. Did anyone solve it?
