Keras STFT layers

I started implementing new Keras layers in the keras_STFT_layer repo.

What are these?

With these layers, you wouldn’t need to pre-compute and store STFT/Melgram/CQT on your hard drive. A new pipeline would be…

  • Store audio files as they are,
    • or perhaps decode them into raw wave (PCM) and store them in npy or hdf.
  • Start training!
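
To make the "decode and store" option concrete, here is a minimal sketch of turning a WAV file into an npy file. The helper name wav_to_npy is hypothetical (it is not part of the repo), and it assumes 16-bit PCM input and that NumPy is installed. It keeps a trailing channel axis so that a mono file comes out with shape (len_src, 1).

```python
import wave

import numpy as np


def wav_to_npy(wav_path, npy_path):
    """Decode a 16-bit PCM WAV file and save it as float32, shape (n_samples, n_channels).

    Hypothetical helper for illustration; not part of keras_STFT_layer.
    """
    with wave.open(wav_path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wf.getnchannels()
        frames = wf.readframes(wf.getnframes())
    # WAV stores little-endian interleaved samples: one row per time step.
    pcm = np.frombuffer(frames, dtype="<i2").reshape(-1, n_channels)
    audio = pcm.astype(np.float32) / 32768.0  # scale to [-1, 1)
    np.save(npy_path, audio)
    return audio
```

Compressed formats (mp3, etc.) would need a separate decoder first; this only covers the uncompressed case.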


The code would be

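# Spectrogram is the new layer from the keras_STFT_layer repo; len_src and time_axis are yours to set.
model = keras.Sequential()
specgram = Spectrogram(n_dft=512, n_hop=128, input_shape=(len_src, 1))
model.add(specgram)
model.add(BatchNormalization(axis=time_axis)) # recommended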
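
For a rough idea of what n_dft=512 and n_hop=128 mean, here is a plain NumPy sketch of a magnitude STFT. This is not the layer's implementation: the Hann window, the lack of padding, and the (n_frames, n_bins) output layout are all my assumptions for illustration, and the layer in the repo may differ on every one of them.

```python
import numpy as np


def stft_magnitude(x, n_dft=512, n_hop=128):
    """Magnitude STFT of a 1-D signal; returns shape (n_frames, n_dft // 2 + 1).

    Illustrative only: assumes a Hann window, no padding, and len(x) >= n_dft.
    """
    window = np.hanning(n_dft)
    # Only keep frames that fit entirely inside the signal.
    n_frames = 1 + (len(x) - n_dft) // n_hop
    frames = np.stack(
        [x[i * n_hop : i * n_hop + n_dft] for i in range(n_frames)]
    )
    # rfft keeps the non-negative frequencies of each windowed frame.
    return np.abs(np.fft.rfft(frames * window, axis=1))
```

With these settings each frame covers 512 samples and consecutive frames overlap by 75%, so the number of frames grows by one for every 128 extra samples of audio.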

Would it be faster?

I will find out 🙂

How’s the quality of the conversion?



More info

Stay tuned to the keras_STFT_layer repo, which has the code, IPython notebook files, and more.


