Learn it Up - Logs

12/05/2026:

13/05/2026:

14/05/2026:

15/05/2026:

IDEA: Consider training LSTM models with recurrent batch normalization and residual connections. To evaluate this, I must first search for existing applications of these techniques to LSTMs in other papers and look at their results.

16/05/2026:

18/05/2026:

QUESTION: DanceDanceConv and DanceDanceConvLSTM both use an architecture similar to the encoder-decoder. Could something like reversing the input improve the model, as it does with seq2seq?
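
NOTE (sketch): to make the reversal question concrete, here is roughly what the seq2seq trick would look like as a preprocessing step. In seq2seq only the source sequence is reversed and the target order is kept. The name reverse_sources and the (source, target) pair layout are my own assumptions, not anything taken from the DanceDanceConv code. Reversal should happen before any padding so the padding stays at the end.

```python
from typing import List, Sequence, Tuple


def reverse_sources(
    pairs: Sequence[Tuple[Sequence[float], Sequence[int]]]
) -> List[Tuple[List[float], List[int]]]:
    """Reverse each source sequence, leaving the target sequence untouched.

    `pairs` is a hypothetical layout: (audio feature frames, step labels).
    """
    return [(list(reversed(src)), list(tgt)) for src, tgt in pairs]


if __name__ == "__main__":
    example = [([0.1, 0.2, 0.3], [0, 1]), ([0.9, 0.8], [1])]
    for src, tgt in reverse_sources(example):
        print(src, tgt)
```

Whether this helps here is an open question: the benefit reported for seq2seq comes from shortening the distance between early source and early target tokens, which may not carry over when audio frames and steps are already time-aligned.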

QUESTION: Could adding this attention encoder-decoder mechanism to DanceDanceConvolution help increase performance without adding the computational costs of ConvLSTM?
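
NOTE (sketch): as a reminder of what an attention step over encoder outputs costs, here is a minimal dot-product version in plain Python. I am assuming dot-product scoring for simplicity; the attention paper I was reading may use a learned additive score instead. The names attend, query and encoder_states are mine, and nothing here comes from the DanceDanceConvolution code.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    query: Sequence[float], encoder_states: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, context vector) for one decoder query.

    Assumes dot-product scoring and that the query and every encoder
    state share the same dimensionality.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    w, c = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
    print(w, c)
```

Per decoder step this is one score per encoder position plus a weighted sum, so the extra cost grows with the length of the encoded window rather than with a recurrent convolution at every step.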

19/05/2026:

QUESTION: Honestly, the original approach is much simpler, but I see the appeal of the transformer. I wonder whether attention-based RNN networks would also perform well if residual connections and/or normalization were added. Food for thought.
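
NOTE (sketch): the transformer's "residual plus normalization" pattern is small enough to write down, which is part of why it seems portable to an RNN layer. This is a plain-Python illustration under my own assumptions: layer normalization without learned scale and shift, and sublayer_out standing in for whatever an LSTM or attention layer would produce for the same timestep. It is not recurrent batch normalization, which normalizes over the batch instead.

```python
import math
from typing import List, Sequence


def layer_norm(values: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def add_and_norm(x: Sequence[float], sublayer_out: Sequence[float]) -> List[float]:
    """Residual connection followed by layer normalization.

    `sublayer_out` is a placeholder for the output of a recurrent or
    attention layer applied to `x`; both must have the same length.
    """
    if len(x) != len(sublayer_out):
        raise ValueError("residual connection needs matching dimensions")
    return layer_norm([a + b for a, b in zip(x, sublayer_out)])


if __name__ == "__main__":
    print(add_and_norm([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

The matching-dimension requirement is the practical catch for LSTMs: the hidden size has to equal the input size, or a projection is needed on the skip path.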

I’ve noticed a detail that I had glossed over in DanceDanceConvLSTM: it does use music information when doing step placement, which feels right to me. I shall also inspect the transformer-based generation approaches to determine whether they do the same.