Release; pre-trained convnet for music auto-tagging

12 thoughts on “Release; pre-trained convnet for music auto-tagging”

      1. A previous post said that the model was trained on ~29 s audio clips from the OMS dataset. Is that correct? If so, where is the code? I couldn’t find it. Thanks!

  1. Hi,

    I downloaded the GTZAN music genre dataset from http://marsyasweb.appspot.com/download/data_sets/?_sm_au_=i7HSSSWqdVMd13T7.
    I converted the GTZAN dataset from a 22050 Hz to a 16000 Hz sampling rate using sox (e.g. sox inputfile.wav -b16 -r16000 out.wav).
    When I ran the example tagging script on audio files from the GTZAN/rock directory, most of them were predicted as jazz.
    What am I doing wrong? (I am using the CRNN with Theano.)

    regards
    Srinidhi

    1. I’d recommend using it as a feature extractor and adding a classifier on top of it, rather than using its predictions as they are.

    2. So you recommend that I train a new model on my own training data and then test it against the GTZAN dataset?
      Why are the uploaded pretrained weights giving wrong results on the GTZAN dataset?
      Thanks
      Srinidhi

  2. Yes, I tested it with a similar network. It will get you 70-80% accuracy. The problem is not GTZAN. The current CRNN weights are kind of weird, although they make sense under the AUC evaluation scheme (AUC does not measure top-K prediction). I’m planning to update them.
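To illustrate the point about AUC versus top-K prediction, here is a minimal sketch in plain Python. The scores are made-up numbers chosen for illustration, not outputs of the released CRNN, and the names `roc_auc`, `TAGS`, `scores` and `truth` are hypothetical. It shows how every tag can have a perfect per-tag AUC while the highest-scoring tag is still "jazz" for every clip, because AUC only compares scores within one tag across clips and never compares tags against each other.

```python
def roc_auc(labels, tag_scores):
    """Per-tag AUC: fraction of (positive, negative) clip pairs
    where the positive clip scores higher; ties count as half."""
    pos = [s for y, s in zip(labels, tag_scores) if y == 1]
    neg = [s for y, s in zip(labels, tag_scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


TAGS = ["rock", "jazz"]

# Made-up scores (assumption, for illustration only).
# Rows are clips, columns follow TAGS. The "jazz" output is
# systematically larger than the "rock" output.
scores = [
    [0.30, 0.60],  # rock clip
    [0.25, 0.55],  # rock clip
    [0.10, 0.80],  # jazz clip
    [0.05, 0.90],  # jazz clip
]
truth = ["rock", "rock", "jazz", "jazz"]

# AUC is computed one tag (column) at a time, across clips.
per_tag_auc = {}
for j, tag in enumerate(TAGS):
    labels = [1 if t == tag else 0 for t in truth]
    column = [row[j] for row in scores]
    per_tag_auc[tag] = roc_auc(labels, column)

# Top-1 prediction compares tags within one clip (row).
top1 = [TAGS[row.index(max(row))] for row in scores]
accuracy = sum(p == t for p, t in zip(top1, truth)) / len(truth)

print(per_tag_auc)  # {'rock': 1.0, 'jazz': 1.0}
print(top1)         # ['jazz', 'jazz', 'jazz', 'jazz']
print(accuracy)     # 0.5
```

In this toy case the rock clips still receive the highest "rock" scores of all clips, so the outputs carry the genre information even though the raw top-1 label is wrong. That is consistent with the suggestion above to train a separate classifier on top of the network rather than reading off the largest output.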
