From 7a008ae6731ac69e7e08ccfa517c8a1a02db9803 Mon Sep 17 00:00:00 2001
From: Robert Plummer
Date: Sun, 28 Aug 2016 21:52:18 -0400
Subject: [PATCH 1/5] switch to a much more basic math example

---
 character_demo.html | 2278 ++++++++-----------------------------------
 1 file changed, 427 insertions(+), 1851 deletions(-)

diff --git a/character_demo.html b/character_demo.html
index 80a61a0..f3dc50f 100644
--- a/character_demo.html
+++ b/character_demo.html
@@ -77,424 +77,6 @@
-
-
@@ -519,1439 +101,7 @@

Deep Recurrent Nets character generation demo

Input sentences:
-
+
@@ -2022,6 +172,432 @@

Deep Recurrent Nets character generation demo

+

From 80dee368548653132706e322fee6c95d8b1829ea Mon Sep 17 00:00:00 2001
From: Robert Plummer
Date: Sun, 28 Aug 2016 21:55:10 -0400
Subject: [PATCH 2/5] add math demo

---
 character_demo.html | 2278 +++++++++++++++++++++++++++++++++++--------
 math_demo.html      |  603 ++++++++++++
 2 files changed, 2454 insertions(+), 427 deletions(-)
 create mode 100644 math_demo.html

diff --git a/character_demo.html b/character_demo.html
index f3dc50f..80a61a0 100644
--- a/character_demo.html
+++ b/character_demo.html
@@ -77,6 +77,424 @@
+
+
@@ -101,7 +519,1439 @@

Deep Recurrent Nets character generation demo

Input sentences:
-
+
@@ -172,432 +2022,6 @@

Deep Recurrent Nets character generation demo

-
diff --git a/math_demo.html b/math_demo.html
new file mode 100644
index 0000000..f3dc50f
--- /dev/null
+++ b/math_demo.html
@@ -0,0 +1,603 @@
+
+
+RecurrentJS Sentence Memorization Demo
+
+
+
+
+
+
+
+
+
+
+
+Fork me on GitHub
+
+
+

Deep Recurrent Nets character generation demo

+
+ This demo shows usage of the recurrentjs library, which allows you to train deep Recurrent Neural Networks (RNN) and Long Short-Term Memory networks (LSTM) in JavaScript. The core of the library is more general, however, and allows you to set up arbitrary expression graphs that support fully automatic backpropagation.

+ + In this demo we take a dataset of sentences as input and learn to memorize the sentences character by character. That is, the RNN/LSTM takes a character together with its context from previous time steps (as mediated by the hidden layers) and predicts the next character in the sequence. Here is an example:

+ +
+ + In the example image above, which depicts a deep RNN, every character has an associated "letter vector" that we train with backpropagation. These letter vectors are combined through a (learnable) matrix-vector multiplication into the first hidden layer representation (yellow), then into the second hidden layer representation (purple), and finally into the output space (blue). The output space has dimensionality equal to the number of characters in the dataset, and every dimension provides the probability of the next character in the sequence. The network is therefore trained to always predict the next character (using a Softmax + cross-entropy loss over all letters). The quantity we track during training is called the perplexity, which measures how surprised the network is to see the next character in a sequence. For example, if the perplexity is 4.0, it is as if the network were guessing uniformly at random from 4 possible characters for the next letter (i.e. the lowest it can be is 1). At test time, the prediction is currently done iteratively, character by character, in a greedy fashion, but I might eventually implement more sophisticated methods (e.g. beam search).
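The perplexity tracked above can be sketched in a few lines of plain JavaScript. The `perplexity` helper below is illustrative only (it is not part of the recurrentjs API): it takes the probability the model assigned to each observed next character and exponentiates the mean cross-entropy.

```javascript
// Illustrative helper, not part of recurrentjs: perplexity from the
// probabilities the model assigned to each observed next character.
// A uniform guess over 4 characters gives perplexity 4; a perfectly
// confident correct model gives the minimum value, 1.
function perplexity(probs) {
  var crossEntropy = 0; // mean negative log-probability, in nats
  for (var i = 0; i < probs.length; i++) {
    crossEntropy += -Math.log(probs[i]);
  }
  crossEntropy /= probs.length;
  return Math.exp(crossEntropy);
}

console.log(perplexity([0.25, 0.25, 0.25, 0.25])); // ≈ 4, the "uniform over 4" case
console.log(perplexity([1, 1, 1]));                // 1, the lower bound
```

This is why a plummeting perplexity during training means the network is becoming less surprised by the data.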

+ + The demo is pre-filled with sentences from Paul Graham's essays, in an attempt to encode Paul Graham's knowledge into the weights of the Recurrent Networks. The long-term goal of the project, then, is to generate startup wisdom at will. Feel free to train on whatever data you wish, and to experiment with the parameters. If you want more impressive models, you have to increase the sizes of the hidden layers, and perhaps slightly increase the size of the letter vectors. However, this will take longer to train.

+ + For suggestions/bugs ping me at @karpathy.

+ +
+
+
Input sentences:
+ +
+
+ +
Controls/Options:
+ + + + +
+ protip: if your perplexity is exploding to Infinity, try lowering the initial learning rate +
+
+ +
+
Training stats:
+
+
Learning rate: you want to anneal this over time if you're training for a longer time.
+
+
+
+ + +
+
+
+
+ +
+
+ +
Model samples:
+
+
+
Softmax sample temperature: a lower setting will generate more likely predictions, but you'll see more of the same common words again and again. A higher setting will generate less frequent words, but you might see more spelling errors.
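The temperature setting above rescales the network's output scores before the softmax. Here is a minimal sketch of temperature-based sampling in plain JavaScript; `sampleWithTemperature` and the raw-score ("logit") input are illustrative assumptions, not the demo's actual code:

```javascript
// Illustrative sketch, not the demo's code: sample an output index after
// dividing the raw scores by the temperature. T = 1 keeps the model's
// distribution; T < 1 sharpens it toward the argmax; T > 1 flattens it.
function sampleWithTemperature(logits, temperature) {
  var max = Math.max.apply(null, logits); // subtract max for numerical stability
  var exps = [];
  var sum = 0;
  for (var i = 0; i < logits.length; i++) {
    var e = Math.exp((logits[i] - max) / temperature);
    exps.push(e);
    sum += e;
  }
  // roulette-wheel selection over the (unnormalized) softmax weights
  var r = Math.random() * sum;
  for (var j = 0; j < exps.length; j++) {
    r -= exps[j];
    if (r <= 0) return j;
  }
  return exps.length - 1; // guard against floating-point leftovers
}
```

With a very low temperature this behaves almost exactly like greedy argmax prediction; with a very high one it approaches uniform sampling over all characters.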
+
+
+
+
+
+
Greedy argmax prediction:
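The greedy prediction shown in this panel is just an argmax over the output probabilities at each step. A tiny illustrative helper (not the demo's actual code):

```javascript
// Illustrative: greedy prediction picks the index of the largest
// probability at every step instead of sampling from the distribution.
function argmax(probs) {
  var best = 0;
  for (var i = 1; i < probs.length; i++) {
    if (probs[i] > probs[best]) best = i;
  }
  return best;
}

console.log(argmax([0.1, 0.7, 0.2])); // → 1
```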
+
+
+
+
I/O save/load model JSON
+ + + +
+ You can save or load models with JSON using the textarea below. +
+ + +
+
Pretrained model:
+ You can also choose to load an example pretrained model with the button below to see what the predictions look like in later stages. The pretrained model is an LSTM with one layer of 100 units, trained for ~10 hours. After clicking the button below you should see the perplexity plummet to about 3.0 and the predictions become better.
+ + +
+
+
+
+

From fc195616529c2a9dc909e0fa39299bf70b81bf25 Mon Sep 17 00:00:00 2001
From: Robert Plummer
Date: Sun, 28 Aug 2016 21:58:01 -0400
Subject: [PATCH 3/5] add math demo

---
 math_demo.html | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/math_demo.html b/math_demo.html
index f3dc50f..512107c 100644
--- a/math_demo.html
+++ b/math_demo.html
@@ -1,6 +1,6 @@
-RecurrentJS Sentence Memorization Demo
+RecurrentJS Math Demo
+
+
+
+
+
+
+
+
+
+
+Fork me on GitHub
+
+
+

Deep Recurrent Nets math demo

+
+ This demo shows usage of the recurrentjs library, which allows you to train deep Recurrent Neural Networks (RNN) and Long Short-Term Memory networks (LSTM) in JavaScript. The core of the library is more general, however, and allows you to set up arbitrary expression graphs that support fully automatic backpropagation.

+ + In this demo we take a dataset of randomly generated math expressions as input and learn to memorize the math logic character by character. That is, the RNN/LSTM takes a character together with its context from previous time steps (as mediated by the hidden layers) and predicts the next character in the sequence. Here is an example:

+ +
+ + In the example image above, which depicts a deep RNN, every character has an associated "letter vector" that we train with backpropagation. These letter vectors are combined through a (learnable) matrix-vector multiplication into the first hidden layer representation (yellow), then into the second hidden layer representation (purple), and finally into the output space (blue). The output space has dimensionality equal to the number of characters in the dataset, and every dimension provides the probability of the next character in the sequence. The network is therefore trained to always predict the next character (using a Softmax + cross-entropy loss over all letters). The quantity we track during training is called the perplexity, which measures how surprised the network is to see the next character in a sequence. For example, if the perplexity is 4.0, it is as if the network were guessing uniformly at random from 4 possible characters for the next letter (i.e. the lowest it can be is 1). At test time, the prediction is currently done iteratively, character by character, in a greedy fashion, but I might eventually implement more sophisticated methods (e.g. beam search).

+ + The demo is populated with random math expressions generated in JavaScript.
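A generator for the random math data mentioned above could look like the following sketch. It assumes single-digit addition rendered as `a+b=c` lines; the demo's actual generator may differ:

```javascript
// Illustrative sketch: produce training "sentences" of simple arithmetic,
// one expression per line, e.g. "3+4=7". The network then learns to
// predict these character sequences, including the digits after '='.
function generateMathData(count) {
  var lines = [];
  for (var i = 0; i < count; i++) {
    var a = Math.floor(Math.random() * 10);
    var b = Math.floor(Math.random() * 10);
    lines.push(a + '+' + b + '=' + (a + b));
  }
  return lines.join('\n');
}
```

Feeding the result into the input textarea gives the network a small, regular "language" whose rules (carrying, the position of `=`) it has to pick up character by character.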

+ + For suggestions/bugs ping me at @karpathy.

+ +
+
+
Input sentences:
+ +
+
+ +
Controls/Options:
+ + + + +
+ protip: if your perplexity is exploding to Infinity, try lowering the initial learning rate +
+
+ +
+
Training stats:
+
+
Learning rate: you want to anneal this over time if you're training for a longer time.
+
+
+
+ + +
+
+
+
+ +
+
+ +
Model samples:
+
+
+
Softmax sample temperature: a lower setting will generate more likely predictions, but you'll see more of the same common words again and again. A higher setting will generate less frequent words, but you might see more spelling errors.
+
+
+
+
+
+
Greedy argmax prediction:
+
+
+
+
I/O save/load model JSON
+ + + +
+ You can save or load models with JSON using the textarea below. +
+ + +
+
Pretrained model:
+ You can also choose to load an example pretrained model with the button below to see what the predictions look like in later stages. The pretrained model is an LSTM with one layer of 100 units, trained for ~10 hours. After clicking the button below you should see the perplexity plummet to about 3.0 and the predictions become better.
+ + +
+
+
+
+

From 7ac113b89315ed1acd1fb06439aefde88b60fc96 Mon Sep 17 00:00:00 2001
From: Robert Plummer
Date: Fri, 4 Nov 2016 11:42:50 -0400
Subject: [PATCH 5/5] added xor demo and
 https://github.com/harthur-org/rnn-viewer

---
 math_demo.html   |  85 +++++++++++++++-
 rnn-viewer.js    | 245 +++++++++++++++++++++++++++++++++++++++++++++++
 src/recurrent.js |   1 -
 xor_demo.html    |  82 ++++++++++++++++
 4 files changed, 411 insertions(+), 2 deletions(-)
 create mode 100644 rnn-viewer.js

diff --git a/math_demo.html b/math_demo.html
index 21c25b0..6742f1b 100644
--- a/math_demo.html
+++ b/math_demo.html
@@ -77,9 +77,13 @@
+
+
+
+
Fork me on GitHub
@@ -113,7 +117,7 @@

Deep Recurrent Nets math demo