Conversation
An example of usage of the layer
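For context, a minimal usage sketch (not the PR's actual example); the constructor nn.GridLSTM(inputSize, outputSize, nbLayers) is an assumption and may not match the real signature added here:

```lua
-- Hypothetical usage of the layer wrapped in a Sequencer, as is usual in rnn;
-- the GridLSTM constructor arguments are assumed, not taken from this PR.
require 'rnn'

local inputSize, outputSize, nbLayers = 10, 10, 2
local model = nn.Sequential()
   :add(nn.Sequencer(nn.GridLSTM(inputSize, outputSize, nbLayers)))
   :add(nn.Sequencer(nn.Linear(outputSize, 1)))

-- one batch: a table of seqLen tensors of size batchSize x inputSize
local seqLen, batchSize = 5, 8
local inputs = {}
for t = 1, seqLen do inputs[t] = torch.randn(batchSize, inputSize) end

local outputs = model:forward(inputs)
print(#outputs, outputs[1]:size())
```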
But it works. It is more a question about what is happening behind the Element-Research framework... how to get it to work as fast as the original version, without memory leaks. Could someone advise me on these questions: 1) how does garbage collection work? Does running multiple forward/backward passes delete the tensors? Should I call the forget method at each step of training? 2) if I want to initialize the parameters (https://github.com/christopher5106/grid-lstm/blob/master/train.lua#L163-L180) inside the layer, in which function should I put it? Thanks a lot
Using rnn:forget() solves the memory and speed issue.
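For reference, a hedged sketch of such a training step; `rnn`, `criterion`, `nextBatch`, `learningRate` and `nIterations` are placeholders for whatever the training script defines, not names taken from this PR:

```lua
-- Sketch of a training loop that calls forget() after every iteration so the
-- states accumulated for BPTT are released; all names here are placeholders.
for iter = 1, nIterations do
   local input, target = nextBatch()            -- hypothetical batch loader
   rnn:zeroGradParameters()
   local output = rnn:forward(input)
   local loss = criterion:forward(output, target)
   local gradOutput = criterion:backward(output, target)
   rnn:backward(input, gradOutput)
   rnn:updateParameters(learningRate)
   rnn:forget()      -- drop the stored inputs/outputs/cells of past steps
   collectgarbage()  -- let Lua's GC actually reclaim the freed tensors
end
```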
Small corrections. Works perfectly now.
@christopher5106 Is it too late for me to ask you to include documentation and a unit test? Sorry for the delay.
Hello guys! Any updates on this PR? Looks tasty!
What exactly would you like for this?
@christopher5106 For documentation, adding a section to README.md with a link to the paper and a brief explanation should do the trick. For unit tests, add a function to test.lua to make sure GridLSTM behaves as expected. It doesn't have to be extensive.
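For illustration, a minimal sketch of what such a test function could look like, following the torch.Tester pattern used by the other tests in test.lua. The constructor nn.GridLSTM(inputSize, outputSize, nbLayers) and the sizes are assumptions, not the PR's actual signature:

```lua
-- Hypothetical sketch of a GridLSTM test for test.lua; the constructor
-- arguments are assumed and may not match the layer added in this PR.
function rnntest.GridLSTM()
   local inputSize, outputSize, nbLayers = 4, 4, 2
   local batchSize, nSteps = 3, 5
   local gridlstm = nn.GridLSTM(inputSize, outputSize, nbLayers)
   for step = 1, nSteps do
      local input = torch.randn(batchSize, inputSize)
      local output = gridlstm:forward(input)
      mytester:assert(output:size(1) == batchSize and output:size(2) == outputSize,
         "GridLSTM output has the wrong size at step "..step)
      local gradInput = gridlstm:backward(input, torch.randn(batchSize, outputSize))
      mytester:assert(gradInput:size(1) == batchSize and gradInput:size(2) == inputSize,
         "GridLSTM gradInput has the wrong size at step "..step)
   end
   gridlstm:forget() -- clear stored states so the next sequence starts fresh
end
```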
    self.cells = {[0] = {}}

    for L=1,self.nb_layers do
       local h_init = torch.zeros(input:size(1), self.outputSize):cuda()
You will have to fix this so it can work without a GPU; I'm getting errors with the CPU version.
I tried removing the cuda() references but could not get it to work on the rnn-sin demo with a 4x2 tensor/table.
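For what it's worth, one way to sketch a device-agnostic version of that initialization (not the PR's actual fix) is to give the zero state the same tensor type as the input instead of hard-coding cuda():

```lua
-- Sketch: allocate the initial state with the same type/device as the input
-- (CPU float or CUDA), instead of calling :cuda() unconditionally.
for L = 1, self.nb_layers do
   local h_init = torch.zeros(input:size(1), self.outputSize):typeAs(input)
   -- ... the rest of the per-layer initialization stays unchanged ...
end
```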
I added a 2D Grid LSTM following https://github.com/coreylynch/grid-lstm
It is pretty slow and I get an out-of-memory error. It sounds like I don't fully understand all aspects of Element-Research/rnn...