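I see that you are using an "unsafe" softmax on line 308 of parser.py. That line is likely to overflow if scores * alpha is large: in float64, np.exp returns inf for arguments above roughly 709, and the normalized result then contains nan. There is a simple trick that makes overflow in softmax impossible: subtract the maximum before exponentiating. Softmax is unchanged when the same constant is added to every input, so the result is identical, but the largest exponent becomes 0 and no exp() value can exceed 1.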
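import numpy as np

def softmax(scores, alpha):
    # scores * alpha is a fresh array, so the caller's scores are not modified.
    # Converting to float64 first keeps the in-place exp valid for integer input.
    x = np.asarray(scores, dtype=np.float64) * alpha
    # Shift so the largest entry is 0; exp() then never exceeds 1.
    x -= x.max()
    np.exp(x, out=x)  # exponentiate in place
    x /= x.sum()      # normalize in place
    return x

print(softmax(np.random.uniform(-300, 300, 10), 10))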
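This code is lightly optimized to reuse temporary arrays: after the initial product scores * alpha, the shift, exponentiation, and normalization all happen in place on that one array.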
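To see the difference concretely, here is a small sketch that compares an unshifted softmax with the shifted one. Note that naive_softmax is a hypothetical stand-in I wrote for illustration; I am assuming it resembles what line 308 does, but it is not copied from parser.py. The input values are also made up.

```python
import numpy as np

def naive_softmax(scores, alpha):
    # Hypothetical unshifted form (an assumption, not the code in parser.py).
    e = np.exp(np.asarray(scores, dtype=np.float64) * alpha)
    return e / e.sum()

def stable_softmax(scores, alpha):
    # Same steps as the shifted version: subtract the max, then exponentiate.
    x = np.asarray(scores, dtype=np.float64) * alpha
    x -= x.max()
    np.exp(x, out=x)
    x /= x.sum()
    return x

# Made-up scores: with alpha = 10 the exponents are 1000, 2000, 3000,
# all far beyond what float64 exp can represent.
scores = np.array([100.0, 200.0, 300.0])

# Silence the overflow / invalid-value warnings the naive version triggers.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(scores, 10)

stable = stable_softmax(scores, 10)

print(naive)   # every entry is nan (inf / inf)
print(stable)  # all the mass lands on the largest score

# On inputs small enough not to overflow, the two versions agree.
small = np.array([0.1, 0.2, 0.3])
print(np.allclose(naive_softmax(small, 10), stable_softmax(small, 10)))
```

The shifted version costs one extra pass over the array for max(), which should be negligible next to the exp() call.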