Position encoding of edits  #3

@tlh24

Description

The model is currently poor at decoding the position of edits, as opposed to the edit type and character (e.g. edit = insert 'c' at 5; the signature is type * char * pos). This may be because:

  1. There is a bug in the python model.
  2. Absolute position encoding is bad, and a "more advanced" scheme such as rotary position encoding should be used instead.
  3. There is a bug in ocaml batch generation.
  4. Training isn't long enough, or the model is too small.
  5. Programs have inherent invariances that lead to ambiguity and training noise: e.g. operand order in addition doesn't matter.

To which I think:

  1. A definite possibility; do we need a positive-control dataset?
  2. Probably also true, but I want to punt on this for now.
  3. Unlikely, based on inspection via plot_mmap.py.
  4. Also unlikely; the model has memorized the datasets in the past. See 'positive control' above.
  5. Very likely the culprit. I suggest http://arxiv.org/abs/1802.03685 as a demo of how to deal with intrinsic invariances (also pertinent: https://arxiv.org/abs/1711.08028).
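To make point 5 concrete, here is a toy illustration (the function and example programs below are mine, not from the repo) of how an operand-order invariance turns absolute edit positions into label noise:

```python
# Toy illustration: under commutativity of +, the same operand appears
# at different flat positions in two semantically identical programs,
# so an edit targeting that operand has conflicting (type, char, pos)
# training labels.
def positions_of(program: str, char: str) -> list[int]:
    """Flat indices at which `char` occurs in the program text."""
    return [i for i, c in enumerate(program) if c == char]

p1 = "a+b"
p2 = "b+a"  # equivalent program: + is commutative

print(positions_of(p1, "a"))  # → [0]
print(positions_of(p2, "a"))  # → [2]
# Both answers are "correct" for the same semantic edit target, which is
# exactly the kind of training noise hypothesized in point 5.
```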

I'm super curious to hear others' thoughts on this. My instinct is to turn the AST (or any graph) into a list of addresses, then use a transformer to encode those into positions to be fed to a larger, orthogonal transformer.
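A minimal sketch of that idea (all names here are made up, using Python's ast module as a stand-in grammar): each node gets an address, the path of child indices from the root, and the resulting (node, address) list is what a small positional transformer could consume instead of flat indices:

```python
# Sketch: flatten an AST into (node_type, address) pairs via preorder
# traversal, where an address is the tuple of child indices from the
# root. These tree addresses make structure explicit, unlike flat
# token positions.
import ast

def ast_addresses(tree: ast.AST):
    """Yield (node_type_name, address) pairs in preorder."""
    def walk(node, addr):
        yield type(node).__name__, tuple(addr)
        for i, child in enumerate(ast.iter_child_nodes(node)):
            yield from walk(child, addr + [i])
    yield from walk(tree, [])

for name, addr in ast_addresses(ast.parse("x = a + b")):
    print(name, addr)
# Note: 'a + b' and 'b + a' still give the operands different addresses,
# so the invariance itself would still need handling (cf. the linked
# papers); the addresses just expose the tree structure to the model.
```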

Basically: programs are graphs (or at minimum trees), so operating on them as lists is dumb, and I think we're already running into these limits.

I imagine this has been described in the literature, but I'm not aware of anything.
