From 6e367557e3b29f8d91c53c4dda19e60fb8c1948f Mon Sep 17 00:00:00 2001
From: Dominic Cerisano
Date: Thu, 17 Aug 2023 20:00:31 -0400
Subject: [PATCH] Fixed broken shakespeare_with_tpu_and_keras.ipynb example

Not sure why this example is now broken on Colab. Possibly a version
mismatch between numpy and tensorflow.

In the current version of the example, the second prediction fails
because predict expects an array of single-element ASCII value arrays
(like the first seeded input). However, the array appended for the next
prediction is a flat array of ASCII values, which causes the predict
function to throw an error.

Also, the printing section expects a flat array of ASCII values rather
than the array of single-element ASCII arrays used for storing
predictions.

The fixes are minor: each sampled character is now created as a
single-element ASCII value array, and is indexed that way when the
results are printed.

Side note: I got much better results by increasing (4X) the number of
epochs, step size and embedding dimensions during training. This
produces a much higher resolution weights file that gives much more
cogent outputs.
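
For illustration only, here is a minimal sketch of the shape handling
described above. It is not the notebook code: the model is replaced by a
hypothetical fake_probs() stand-in, the BATCH_SIZE and PREDICT_LEN values
are made up, and the seed is assumed to have shape (BATCH_SIZE, 1). Only
the sampling and printing logic mirrors the patched lines.

```python
import numpy as np

# Hypothetical values, chosen only to keep the example small.
BATCH_SIZE = 5
PREDICT_LEN = 4

rng = np.random.default_rng(0)


def fake_probs():
    """Stand-in for the model output: uniform over printable ASCII (32..126)."""
    probs = np.zeros((BATCH_SIZE, 256))
    probs[:, 32:127] = 1.0
    return probs / probs.sum(axis=1, keepdims=True)


def sample_step(next_probits):
    # Wrap each sampled id in a one-element list so the stacked result has
    # shape (BATCH_SIZE, 1), the same shape as the seeded input.
    next_idx = [
        [rng.choice(256, p=next_probits[i])]
        for i in range(BATCH_SIZE)
    ]
    return np.asarray(next_idx, dtype=np.int32)


# Assumed seed: one character per batch row, shape (BATCH_SIZE, 1).
seed = np.full((BATCH_SIZE, 1), ord('A'), dtype=np.int32)
predictions = [seed]
for _ in range(PREDICT_LEN - 1):
    predictions.append(sample_step(fake_probs()))

# Every stored step now has the same shape.
shapes = {p.shape for p in predictions}

texts = []
for i in range(BATCH_SIZE):
    p = [predictions[j][i] for j in range(PREDICT_LEN)]
    # Each entry of p is a one-element array, so index it before chr().
    texts.append(''.join(chr(c[0]) for c in p))
```

Without the inner brackets in sample_step, each sampled step would have
shape (BATCH_SIZE,) instead of (BATCH_SIZE, 1), so the stored steps would
no longer match the seed.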
---
 tools/colab/shakespeare_with_tpu_and_keras.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/colab/shakespeare_with_tpu_and_keras.ipynb b/tools/colab/shakespeare_with_tpu_and_keras.ipynb
index 8f35dd5fc..36b83ba53 100644
--- a/tools/colab/shakespeare_with_tpu_and_keras.ipynb
+++ b/tools/colab/shakespeare_with_tpu_and_keras.ipynb
@@ -423,7 +423,7 @@
 " \n",
 " # sample from our output distribution\n",
 " next_idx = [\n",
- " np.random.choice(256, p=next_probits[i])\n",
+ " [np.random.choice(256, p=next_probits[i])]\n",
 " for i in range(BATCH_SIZE)\n",
 " ]\n",
 " predictions.append(np.asarray(next_idx, dtype=np.int32))\n",
@@ -432,7 +432,7 @@
 "for i in range(BATCH_SIZE):\n",
 " print('PREDICTION %d\\n\\n' % i)\n",
 " p = [predictions[j][i] for j in range(PREDICT_LEN)]\n",
- " generated = ''.join([chr(c) for c in p]) # Convert back to text\n",
+ " generated = ''.join([chr(c[0]) for c in p]) # Convert back to text\n",
 " print(generated)\n",
 " print()\n",
 " assert len(generated) == PREDICT_LEN, 'Generated text too short'"