Update 20 Apr 2016: Check out the paper on arXiv (PDF)

LSTM Realbook

Summary

Char-RNN and Word-RNN with Keras (LSTM text generation example) and Theano for automatic composition of jazz scores. 2,846 Realbook jazz songs from a website were converted into a text file and used for training. The Band-in-a-box files were converted into text using Java code written by Simon Dixon and Matthias Mauch, then learned as usual. The result is fun – you can listen to 5 LSTM Realbook songs!

A Quick look at things

LSTM

LSTM (Long Short-Term Memory) is a type of RNN. It is known to learn sequences effectively, including dependencies that span many time steps.

RNN

RNN (Recurrent Neural Network) is a type of deep learning neural network. See this post by WildML for further understanding.

Char-RNN

The character-based RNN introduced by Andrej Karpathy is literally an RNN that predicts the next character. The character space can be significantly smaller than the word space, which makes learning efficient in terms of both memory and computation. Please enjoy the numerous examples in the original post – Shakespeare, Wikipedia, Linux source code, baby names, etc.

Keras

Keras is a deep learning framework that runs on top of Theano and TensorFlow. I used Theano as the backend, but this shouldn’t affect the output.

Realbook

Realbook is a collection of jazz standards with title, composer, chords, theme melody, and rhythm. As far as I know there are Realbook 1, 2, 3, New Realbook 1, 2, 3, the Fakebook, the Real Latin Book, etc.

Band-in-a-box

Band-in-a-box is software by PG Music with quite a long history – since 1990! It is literally a band in a box for you: it accompanies, improvises solos, and plays the theme, just like your band, except that it doesn’t hear what you play (or make mistakes).

The procedure

Conversion to some known formats.

I don’t know how the Band-in-a-box score file (.mgu) is structured, so I used Java code written by Simon Dixon and Matthias Mauch. As a result, I got files like this, in .xlab format:

1:1 0.0 8 2.8235294117647056 C:maj C C
3:1 2.8235294117647056 8 5.647058823529411 G:9 G9 C
5:1 5.647058823529411 4 7.0588235294117645 C:9 C9 C
6:1 7.0588235294117645 4 8.470588235294118 C:7 C7 C
7:1 8.470588235294118 4 9.88235294117647 F:maj F C
8:1 9.88235294117647 4 11.294117647058822 F:min7 Fm7 C
9:1 11.294117647058822 4 12.705882352941176 C:maj C C
10:1 12.705882352941176 4 14.117647058823529 G:min7 Gm7 C
11:1 14.117647058823529 4 15.529411764705882 A:7 A7 C

The .lab format is explained here, and .xlab is an extension of .lab. The format seems to be:
bar:measure start_time beat end_time chord chord key
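
Reading such a line is straightforward in Python. Below is a minimal sketch that assumes the seven whitespace-separated columns shown in the example lines above. The names XlabRow and parse_xlab_line, and the field names, are hypothetical; they are not part of the Java converter, and the column meanings are still a guess.

```python
from typing import NamedTuple


class XlabRow(NamedTuple):
    bar: int           # number before the colon in the first column
    position: int      # number after the colon in the first column
    start_time: float  # second column
    beats: int         # third column; looks like a length in beats
    end_time: float    # fourth column
    chord: str         # first chord notation, e.g. "G:9"
    chord_short: str   # second chord notation, e.g. "G9"
    key: str           # last column


def parse_xlab_line(line: str) -> XlabRow:
    """Split one .xlab line into typed fields (field meanings are assumed)."""
    first, start, beats, end, chord, chord_short, key = line.split()
    bar, position = first.split(":")
    return XlabRow(int(bar), int(position), float(start), int(beats),
                   float(end), chord, chord_short, key)


row = parse_xlab_line("3:1 2.8235294117647056 8 5.647058823529411 G:9 G9 C")
print(row.bar, row.chord, row.key)
```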

Preprocess

To decrease the number of cases, I normalised the scores so that every score is in the key of C. I used the <key> information in the .xlab format and transposed every chord.
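
The transposition can be sketched as follows. This is not the code I actually ran, just an illustration under a few assumptions: the key is a plain note name, roots are written as the part before the colon, flats are mapped onto sharps, and anything after the colon (including a bass written as a degree after "/") is left untouched. The function names are made up for this example.

```python
# Pitch classes spelled with sharps, as in the chord names of this post.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Flat spellings are mapped onto their sharp equivalents (an assumption).
FLATS = {"Db": "C#", "Eb": "D#", "Gb": "F#", "Ab": "G#", "Bb": "A#"}


def pitch_class(note: str) -> int:
    """Return 0..11 for a note name such as 'C', 'F#' or 'Bb'."""
    return NOTES.index(FLATS.get(note, note))


def transpose_to_c(chord: str, key: str) -> str:
    """Shift the root of e.g. 'C:min7' so that `key` becomes C."""
    root, sep, quality = chord.partition(":")
    new_root = NOTES[(pitch_class(root) - pitch_class(key)) % 12]
    return new_root + sep + quality


# A C minor seventh in the key of F becomes a G minor seventh in the key of C.
print(transpose_to_c("C:min7", "F"))
```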

Of the two chord notations on each line, I chose the first one – e.g. C:maj, G:7, C:sus4(b7).

Some bars contain more than one chord (e.g. see bar 22 in the Band-in-a-box screenshot above). I decided to fill every blank beat with the chord that is sounding at that point. For example

| A:7 . . . | A:7 . C:maj . |

is converted into…

| A:7 A:7 A:7 A:7 | A:7 A:7 C:maj C:maj |

but without bar lines. (Perhaps I should have kept the bar lines in the training file?)

Every score is converted into a sentence with a start flag and end flag. A two-bar score would be represented as below:

_START_ A:7 A:7 A:7 A:7 A:7 A:7 C:maj C:maj _END_
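
The beat filling and the flags can be sketched in a few lines of Python. This is a minimal illustration that assumes the bar notation used in the example above ("|" for bar lines, "." for a blank beat); the function name bars_to_sentence is hypothetical.

```python
def bars_to_sentence(score: str) -> str:
    """Turn '| A:7 . . . | A:7 . C:maj . |' into a flagged, bar-free sentence."""
    tokens = []
    current = None
    for symbol in score.replace("|", " ").split():
        if symbol != ".":          # a new chord starts on this beat
            current = symbol
        if current is None:
            raise ValueError("score starts with a blank beat")
        tokens.append(current)     # a blank beat repeats the running chord
    return " ".join(["_START_"] + tokens + ["_END_"])


print(bars_to_sentence("| A:7 . . . | A:7 . C:maj . |"))
# _START_ A:7 A:7 A:7 A:7 A:7 A:7 C:maj C:maj _END_
```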

Some numbers about the data

2,846 scores
1,259 unique chords
39 characters (including C, D, E, F, G, A, B, #, b, :, m, a, j, i, n, s, u, …)
1,239.78 characters per score on average
Maximum 10,151 characters in a score
3,531,261 characters in total
545,300 words in total

# The text file is included in the repo.
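
Numbers like these are easy to recompute from the text file. The sketch below assumes one score (sentence) per list entry and counts the flags and spaces like any other word or character, which may differ slightly from how I counted. The function name corpus_stats and the toy corpus are made up for illustration.

```python
def corpus_stats(scores):
    """Count scores, words and characters in a list of score sentences."""
    words = [word for score in scores for word in score.split()]
    lengths = [len(score) for score in scores]
    return {
        "scores": len(scores),
        "unique_words": len(set(words)),
        "unique_chars": len(set("".join(scores))),
        "mean_chars": sum(lengths) / len(lengths),
        "max_chars": max(lengths),
        "total_chars": sum(lengths),
        "total_words": len(words),
    }


# Toy corpus, not the real training data.
toy = ["_START_ A:7 A:7 C:maj _END_", "_START_ C:maj _END_"]
stats = corpus_stats(toy)
for name, value in stats.items():
    print(name, value)
```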

Char-RNN and Word-RNN

Char-RNN: predicts a character given the previous N characters. The character space has 39 dimensions.

Word-RNN: predicts a word given the previous N words. A word is a chord name here, and the word space has 1,259 dimensions.
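
To make the difference concrete, here is a small sketch of how the training pairs could be cut out of a sentence with a sliding window, in the spirit of the Keras text generation example. The function name make_windows and the toy sentence are hypothetical, and the one-hot encoding step is left out.

```python
def make_windows(tokens, maxlen, step=1):
    """Return (inputs, targets): each input is `maxlen` tokens, the target is the next one."""
    inputs, targets = [], []
    for i in range(0, len(tokens) - maxlen, step):
        inputs.append(tokens[i:i + maxlen])
        targets.append(tokens[i + maxlen])
    return inputs, targets


text = "_START_ A:7 A:7 C:maj C:maj _END_"
char_x, char_y = make_windows(list(text), maxlen=4)     # Char-RNN view
word_x, word_y = make_windows(text.split(), maxlen=4)   # Word-RNN view

print(len(char_x), "character windows,", len(word_x), "word windows")
print(word_x[0], "->", word_y[0])
```

With the same window length, a word window covers far more of the score than a character window does.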

The word space is bigger, but a word (chord) carries more meaning. Also, a sentence (score) needs fewer words than characters, so the LSTM doesn’t have to remember as far back when it deals with words. However, I just used the same structure and hyperparameters for both, as below:


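from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

# maxlen: length of the input window; num_chars: vocabulary size
# (39 for Char-RNN, 1,259 for Word-RNN). Both come from the data preparation.
model = Sequential()
# three stacked LSTM layers of 512 units, each followed by dropout
model.add(LSTM(512, return_sequences=True, input_shape=(maxlen, num_chars)))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=False))
model.add(Dropout(0.2))
# softmax over the vocabulary gives the probability of the next token
model.add(Dense(num_chars))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')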

which is super arbitrary.

Result 1 – Char-rnn

So I got some chord progressions. I did something tedious to make this post more interesting: sequencing MIDI, copying and pasting Apple Loops, and adding some bass notes for a more pleasing experience. The chords in red are what you hear.

Char-rnn, diversity 0.5, iteration: 1 (LSTM Realbook 1)

Seed: maj C:maj C:maj C:ma

Result: maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj

With a low diversity it is so boring. A I – V – I progression is not fun! My takeaway from this result is that the I and V chords are the most common in music.
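
A note on "diversity": in the Keras text generation example it works as a temperature applied to the softmax output before sampling the next token. The sketch below is a plain-Python version of that idea, not the exact Keras code; the function names and the probabilities are made up, and zero probabilities are clipped to a tiny value as an assumption.

```python
import math
import random


def apply_diversity(probs, diversity):
    """Rescale a distribution to p_i ** (1 / diversity), renormalised."""
    logits = [math.log(max(p, 1e-12)) / diversity for p in probs]
    top = max(logits)                        # subtract the max for numerical stability
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, diversity, rng=random):
    """Draw one index from the rescaled distribution."""
    scaled = apply_diversity(probs, diversity)
    return rng.choices(range(len(scaled)), weights=scaled, k=1)[0]


probs = [0.6, 0.3, 0.1]                      # made-up softmax output
low = apply_diversity(probs, 0.5)            # sharper: favours the top choice
high = apply_diversity(probs, 1.2)           # flatter: rare choices get more mass
print(low)
print(high)
print(sample_index(probs, 1.2))
```

A diversity below 1 pushes the model towards its most likely token, which fits the endless C:maj above; a diversity above 1 lets rarer chords through.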

Char-rnn, diversity 1.2, iteration: 1 (LSTM Realbook 2)

Seed: maj C:maj C:maj C:ma

Result: maj C:maj C:maj C:maj G#:min6 G#:min D#:7 D#:7 D:min7 C:maj/5 C:maj/5 C:maj/5 C:maj/5 C:maj/5 C:maj/5 C:maj/5 C:maj6/5 E:min7/4 E:min7/4 C:#ds(s11) C:sus4 D#:maj6 A#:maj A#:maj A#:maj A#:maj A:7 A:7 A:7 A:7 A:7 D:min7 D:min7 D:min7 D:hdim D:hdim C:hdim C:hdim C:hdim C:hdim C:hdim C:hdim C:hdim G:9 G:9 D:min7 D:min7 D#:dim G#:di9 G#:min7 G#:min7 G#:min7 G:sus4(b7) G:sus4(b7) G:sus4(b7) G:sus4(b7) G:sus4(b7) G:sus4(b7) G:sus4(b7) G#:9(s51) D:min D:min G:7/5 G:7/5 C:maj C:maj C:maj C:maj G:7(b9) G:7(b9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) D:maj7 D:maj7 G:maj(b7,b9,11,13) E:maj(b7,b9,11,13) E:9 E:9 E:9 E:9 E:9(s5,*5) E:9(b5,*5) E:min(b7,9,11) E:min(b7,9,11) E:min(b7,9,11) E:min7 E:min7 D:min6 D:min6 D:min6 A:min(b6,9,11) A:maj(b7,9,11,13) D:min7 D:min7 D:min7 A:9 A:min6 A:min A#:maj C#:min C#:min G#:maj G#:maj G#:maj G:9 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:sus4(b7,9) F:maj6 F:maj6 F:maj6 F:maj F:maj A:maj A#:maj A:hdim A:dim :mdn7 D:min7 D:min7 G:maj(b7,b9,11,13) C:6 C:9 F:9 A#s5us4(b7) A#:sus4(b7) B:sus4(b9) B:sus4(b7,9) B:sus4(b7,b5,49) B:sus4(b7) B:sus4(b7) B:sus4(b7) F:maj7 F:maj7 F:maj7 F:maj7 F:maj7 F:maj7 F:maj7 F:maj7 F:maj7 D#:maj7 D:dim D:dim F:maj7 F:maj6 D:min7 D:min7 C:maj C:maj F:maj F:maj F:maj F:aug F:7 F:7 F:7 C:7 G:min D:hdim G:sus4(b7,9) G:sus4(b7,9) C:maj7 C:maj7 C:maj7 C:maj7 C:maj7 F:maj9 F:maj7 F:maj7 F:maj7 F#:hdim F#:hdim F:min7 E:7(s5,*5) A:7(s9) A:7(s9) A:7(b9) A:7(b9) A:7(b9) D:min7 D:min7 E:min(7) E:min(7) E:min7 A:min7 A:min7 A:min7 A:min7 F:

One iteration already results in a kind of okay progression.
As each chord token represents one beat, a chord should usually repeat 4 times. That is not obvious in this example.
No II-V-I progression yet.

Char-rnn, diversity 0.5, iteration: 22

Seed: aj C:7/5 C:7/5 C:7 C

Result: aj C:7/5 C:7/5 C:7 C:7 F:maj F:maj F:min F:min F:min F:min C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj G:7 G:7 G:7 G:7 C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj

Even with more iterations, a diversity of 0.5 still looks the same: lots of C major, a few G7, then back to C major.

Char-rnn, diversity 1.2, iteration 22 (LSTM Realbook 3)

Seed: aj C:7/5 C:7/5 C:7 C

Result: aj C:7/5 C:7/5 C:7 C:7 F:maj6 F:maj6 F#:dim F#:dim C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:6(9) C:maj E:7(b9) E:7(b9) E:7(b9) E:7(b9) A:min(6,9) A:min(6,9) A#:min(6,9) A#:min(6,9) A#:min(6,9) A#:min(6,9) A#:min(6,9) A#:min(6,9) A:(1,b3,b5) A:(1,b3,b5) G#:(1,b3,b5) G#:(1,b3,b5) G:min(b7,9,11) G:min(b7,9,11) G:min(b7,9,11) G:min(b7,9,11) G#:min(b7,9,11) F#:min(b7,9,11) F#:min(b7,9,11) C:min9 C:min9 B:7(s5,*5,s9) B:7(s5,*5,s9) B:7(s5,*5,s9) C:6(9) C:6(9) C:6(9) A:7 A:7 D:min7 D:min7 D:min7/9 D:min(b7,9,11)/b5 E:min(b7,9,11)/b3 E:min7(s5,*5)/b7 E:min(b7,9,11)/b7 E:min(b7,9,11)/b7 E:min(b7,9,11)/b7 E:min7 E:min7 E:min7 E:min7 A:7(b9) A:7(b9) D:min D:min D:min D:min G:9 G:9 C:maj C:maj F:9 F:9 F:9 F:9 C:maj6 C:maj6 C:maj6 C:min6 C:min6 F:9 F:9 F:9 F:9 D:min7 D:min7 D:min7 D:min7 D:min7 C#:maj(b7,9,11,13)/b7 G#:maj(b7,9,11,13) G#:maj(b7,9,11,13) C:maj7 C:maj7 E:min7 E:min7 D#:min7 D#:min7 D#:min7 E:min7 A:7(b9) A:7(b9) D:min D:min D:min D:min D:min D:min D:min D:min _END_ _START_ A:min A:min F:maj F:maj B:7 B:7 E:maj E:maj E:maj B:9 B:9 B:maj B:maj D:min(b7,9,11) D:min(b7,9,11) D:min(b7,9,11) D:min(b7,9,11) D:min(b7,9,11) G:aug(b7,9) G:aug(b7,9) G:min7 G:min7 E:aug(b7) A:7 A:7 A:7 A:7 A:7 D:min7 D:min7 D:min7 D:min7 C#:maj/2 C#:maj/2 C#:maj/2 D:aug(b7) D:7 F:maj F:maj G:maj G:maj C:maj C:maj A#:maj A:7 A:7 A:7 A:7 D:min7 E:dim F:min6 D:hdim D:hdim D:hdim D:hdim D:hdim D:hdim D:hdim D:hdim G:7(b9) G:7(b9) G:7(b9) G:7(b9) G#:7 G#:7 E:hdim E:hdim E:hdim E:hdim E:hdim A:7 A:

It learned that _START_ always comes together with _END_.
After _START_, it doesn’t start with C but with A! Although I ‘normalised’ the scores, there were many excerpts that don’t begin with C.
Do you think it learned the 4-times repetition, i.e. one chord per bar? I’m not sure.
Now I see II-V-I! Dm-G9-CM.
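
Spotting such progressions by eye gets tiring, so here is a rough sketch of how one could search a generated sequence automatically. It merges repeated chords and then looks for a D minor chord, a G7 or G9 chord, and a C major chord in a row. The prefix matching is a crude assumption of mine (it misses C:6(9), for example), and the function names are hypothetical.

```python
from itertools import groupby


def collapse(chords):
    """Merge consecutive repeats: one entry per chord change."""
    return [chord for chord, _ in groupby(chords)]


def find_two_five_one(chords):
    """Return indices in the collapsed list where a II-V-I in C starts."""
    changes = collapse(chords)
    hits = []
    for i in range(len(changes) - 2):
        two, five, one = changes[i:i + 3]
        if (two.startswith("D:min")
                and five.startswith(("G:7", "G:9"))
                and one.startswith("C:maj")):
            hits.append(i)
    return hits


# Excerpt from the diversity 1.2, iteration 22 result above.
generated = "A:7(b9) A:7(b9) D:min D:min D:min D:min G:9 G:9 C:maj C:maj F:9".split()
print(collapse(generated))
print(find_two_five_one(generated))
```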

Result 2 – Word-rnn

Word-rnn, diversity 0.5, iteration 1

C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj
C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:maj6

Word-rnn, diversity 1.2, iteration 1

C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj
C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj _END_ F:min G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:maj C:maj C:7 C:7 F#:dim C:maj6 _START_ _START_ C#:maj A#:min A:sus4/5 C:maj/3 F:min7 A:min7 D:min7 D:min7 G:sus4(b7,9) D:aug(b7) G:9 F:maj/2 G:7(s5,*5) G:9 G:7 G:sus4(b7,s5,*5)/b5 C:6(9) C:maj7 C:maj7 G:7 G:7 C:maj7 D:min7 C:maj6 E:min7 G:sus4(b7) G:9 C:maj6 C:maj6 E:hdim C:maj6 _START_ D:min7/b7 G:9/2 D:min7/b7 D:min7 D:min7 G:7 A:dim F:min6/5 G:7 G:7 G:7 G:7 G:7 D:min7 D:min7 G:7 G:7 G:dim G:7 G:7 F:min6 G:maj6 D:min7 D:min7 C:dim F#:hdim B:7 E:min7 B:7(b9) D:maj6 D:9 D:9 C:7(b9) G:maj/2 D:min D:min7 D:min7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:maj G:7 D:min7 D:min7 D:min7 G:7 G:9 D:min7 D:min7 G:7 G:7 D:min7 G:7 A:7(b9) A:7 A:min7 A:9 D:min D:min7 D:min7 D:min7 D:min7 C:maj C:maj C:7 C:7 F:maj6 G:7 G:7 G:7 D:min7/4 F:aug F:min6 D:sus4 D#:aug(b7) G:min6 C:maj/3 D:min F:min6 G:9 G:9 _START_ G:9 D:min7 D:min7 D:min7 D:min7 D:min7

Obviously it doesn’t know the meaning of _START_ and _END_.

Word-rnn, diversity 0.5, iteration 8 (LSTM Realbook 4)

G:7(b9) C:maj C:maj A:min A:min D:min7 D:min7 G:7(b9) G:7(b9) C:maj C:maj C:maj C:maj A:min7 A:min7 A:min7 A:min7 D:9 D:9 D:9
G:7(b9) C:maj C:maj A:min A:min D:min7 D:min7 G:7(b9) G:7(b9) C:maj C:maj C:maj C:maj A:min7 A:min7 A:min7 A:min7 D:9 D:9 D:9 D:9 D:9 D:9 D:9 D:9 D:7 D:7 D:7 D:7 D:min7 D:min7 D:min7 D:min7 G:7 G:7 G:7 G:7 C:maj C:maj C:maj C:maj C:7 C:7 C:7 C:7 F:maj F:maj F:maj F:maj F:min F:min F:min F:min C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj D:7 D:7 D:7 D:7 G:7 G:7 G:7 G:7 C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj G:7 G:7 G:7 G:7 G:7 G:7 G:7 G:7 C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj

Looks like it learned it’s good to fill a bar with the same chord.

Word-rnn, diversity 1.2, iteration 8 (LSTM Realbook 5)

G:7(b9) C:maj C:maj A:min A:min D:min7 D:min7 G:7(b9) G:7(b9) C:maj C:maj C:maj C:maj A:min7 A:min7 A:min7 A:min7 D:9 D:9 D:9
G:7(b9) C:maj C:maj A:min A:min D:min7 D:min7 G:7(b9) G:7(b9) C:maj C:maj C:maj C:maj A:min7 A:min7 A:min7 A:min7 D:9 D:9 D:9 E:hdim/b5 D:9 D:9 D:9 D:9 D:min7 D:min7 D:min7 D:min7 G:7 G:7 G:7 G:7 C:maj6/6 F#:dim C:9 C:9 F:maj6 F:maj6 C:7 C:7 F:maj6 F:maj6 F:maj6 F:maj6 A#:maj6 A#:maj6 A#:maj6 A#:maj6 C:maj/3 C:maj/3 C:maj/3 C:maj/3 C:maj C:maj C:maj C:maj C:maj C:maj C:maj C:maj A#:9 F:maj7 A#:7 A#:7 C:maj C:maj C:maj/3 C:maj/3 F:maj6 F:maj6 C:maj C:maj C:maj C:maj C:maj C:maj G:min7 G:min7 G:min7 G:min7 F:maj F:maj F:maj F:maj D:min7 D:min7 D:min7/4 G:sus4(b7) G:min9 G:min9 G:min9 F#:(1,3,b5,b7,9,13) C:6(9) G:sus4(b7,9) E:7(s5,*5,s9) E:7(s5,*5,s9) A:min9 A:min9 A:min9 A:min9 D:min7 D:min7 D:7(b9) D:9 G#:9(s11) G:(1,4,5,b7,9,11,13) G:aug(b7,9) G:9(s5,*5) C:maj6 C:maj6 E:min7(s5,*5)/b7 A:7(s9,s11,b13) D#:min9 D:min9 F:maj(b7,9,11,13) F:maj(b7,9,11,13) B:min9 D#:min9 D:min9 D:min9 C:6(9) C:6(9) B:hdim B:hdim A:min7 A:min7 G#:7(s9) A:min7/4 B:hdim E:7(s5,*5,b9) E:aug(b7) E:aug(b7) A:min7 A:min7 D:maj(b7,9,11,13) D:maj(b7,9,11,13) D:min D:min D:min(7) D:9(b5,*5) A#:7 A#:7 A#:7 A#:7 B:min7 B:min7 E:7 E:7 D:7 D:7 D:7 D:7 D:7 D:7 D:7 D:7 G#:9 G#:9 C#:maj7 C#:maj7 B:maj B:maj D#:sus4 F#:maj6 B:maj7 B:maj7 A#:maj7 A#:maj7 A#:maj7

Code

The code and the training text file are provided in my repo. Enjoy!

6 Comments

Gene (@FYJLuo) says:

May 17, 2018 at 5:22 am

Hi Keunwoo, it is surprising to see how such a perspective from text processing works in music.
So did you consider word embedding instead of one-hot encoding? As there are more than one thousand vocabularies in the word-RNN, it might be a good idea to use a word embedding; I tried it but didn’t make comparison with your setting (without word embedding).

And I think it might be helpful to rule out rare chords such as “7(b5,*5,s9)” whose number of occurrences is only 32 in the corpus (compared to “maj”, with 130411 n. occurrences); by the way, could I ask what does the notation (b, *, s) mean in the chord?

Thanks again for the interesting work, and the tedious editing of the generated chords indeed sound much more pleasant 🙂

1. Gene (@FYJLuo) says:
  
  May 21, 2018 at 3:27 am
  
  Oh, for the description of chord notation I missed your link, sorry for not paying attention.
  The definition of the chord notation is based on Harte et al., “Symbolic Representation of Musical Chords: A Proposed Syntax for Text Annotations”, 2005.
  
Kian says:

July 16, 2018 at 3:21 am

Hi Keunwoo- Awesome article! Read through your paper too, and it was very enjoyable.

Quick question- what software did you use to convert your chords into audio? I’m working on a similar model based on Markov Chains, and would love to convert my chords into audio that I can listen to.

Karthick veerakumar says:

November 22, 2018 at 6:54 am

Hey Keunwoo, the blog was awesome except I had a few doubt, how can I write the notes back into a music file which can later be played. Appreciate your help here 🙂
