The model is currently poor at decoding the *position* of edits, as opposed to their type and character (e.g. the edit `insert 'c' at 5` has signature `type * char * pos`). This may be because:
1. There is a bug in the Python model.
2. Absolute position encoding is bad; a more advanced scheme like rotary encoding should be used instead.
3. There is a bug in the OCaml batch generation.
4. Training isn't long enough, or the model is too small.
5. Programs have inherent invariances, which lead to ambiguity and training noise: e.g. operand order in addition doesn't matter.
To which I think:
1. A definite possibility. Do we need a positive-control dataset?
2. Also probably true, but I want to punt on this for now (see the rotary sketch after this list for what it might look like).
3. Unlikely, as inspected via plot_mmap.py.
4. Also unlikely: the model has in the past memorized the datasets. See 'positive control' above.
5. Very likely the culprit. I suggest http://arxiv.org/abs/1802.03685 as a demo of how to deal with intrinsic invariances (also pertinent: https://arxiv.org/abs/1711.08028); a cheap canonicalization mitigation is sketched after this list as well.
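Even though I'd punt on (2) for now, for concreteness here is a minimal sketch of what a rotary-style position embedding looks like, assuming "rotational encoding" means RoPE; none of this is in the codebase, and the PyTorch here is illustrative only:

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate pairs of feature channels by position-dependent angles.

    x: (seq_len, dim) with dim even. Applied to both queries and keys,
    attention dot products then depend on the relative offset between
    tokens rather than on absolute indices.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Geometrically spaced rotation frequencies, one per channel pair.
    freqs = base ** (-torch.arange(half, dtype=x.dtype) / half)
    angles = torch.arange(seq_len, dtype=x.dtype)[:, None] * freqs
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    # 2-D rotation applied independently to each (x1, x2) channel pair.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```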
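On (5), independent of the graph-network approaches in those papers, one cheap mitigation would be to canonicalize commutative subtrees before generating edit targets, so equivalent programs map to a single serialization. A hypothetical sketch (the nested-tuple AST here is made up, not our actual representation):

```python
# Hypothetical AST as nested tuples: ("add", left, right), ("lit", 3), ("var", "x"), ...
COMMUTATIVE = {"add", "mul"}

def canonicalize(node):
    """Recursively sort operands of commutative operators into a fixed
    order, so semantically identical programs share one serialization
    and edit positions stop being ambiguous training targets."""
    if not isinstance(node, tuple):
        return node
    op, *args = node
    args = [canonicalize(a) for a in args]
    if op in COMMUTATIVE:
        args.sort(key=repr)  # any deterministic total order will do
    return (op, *args)

assert canonicalize(("add", ("var", "y"), ("var", "x"))) == \
       canonicalize(("add", ("var", "x"), ("var", "y")))
```

This only removes the invariances we can enumerate; the graph-based losses in the papers above handle the general case.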
Super curious to hear others' thoughts on this. My instinct is to turn the AST (or any graph) into a list of addresses, then use a transformer to encode those addresses into positions to be fed to a larger, orthogonal transformer (rough sketch below).
Basically: programs are graphs (or at minimum trees), so operating on them as lists is dumb, and I think we're already running into those limits.
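A rough, hypothetical sketch of that idea (all names invented, PyTorch assumed): each node's address is its root-to-node path of child indices, and a small transformer pools each address into a vector that the main model consumes in place of absolute position embeddings.

```python
import torch
import torch.nn as nn

class AddressEncoder(nn.Module):
    """Embed each AST node's root-to-node address (a sequence of child
    indices, 0 = padding) into a position vector for the main model."""

    def __init__(self, max_arity: int = 16, max_depth: int = 32,
                 d_model: int = 64, n_layers: int = 2):
        super().__init__()
        # Child indices are 1..max_arity; 0 is reserved for padding.
        self.step_embed = nn.Embedding(max_arity + 1, d_model, padding_idx=0)
        # Marks which step of the path each index sits at, so the
        # encoder can distinguish address [1, 2] from [2, 1].
        self.depth_embed = nn.Embedding(max_depth, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=4, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, addresses: torch.Tensor) -> torch.Tensor:
        # addresses: (num_nodes, max_depth) int64 -> (num_nodes, d_model)
        steps = torch.arange(addresses.shape[1], device=addresses.device)
        h = self.encoder(self.step_embed(addresses) + self.depth_embed(steps))
        mask = (addresses > 0).unsqueeze(-1).to(h.dtype)
        # Mean-pool over the unpadded address steps: one vector per node.
        return (h * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)

# Two nodes: root -> child 1 -> child 2, and root -> child 1.
positions = AddressEncoder()(torch.tensor([[1, 2, 0], [1, 0, 0]]))
```

Two nodes at the same tree address would then get the same position vector regardless of where they land in the flattened token list.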
I imagine this has been described in the literature somewhere, but I'm not aware of anything?