Hello,
Sorry if this is a silly question, but looking at your code in ptr_base.py, line 90, the LinformerEncoder layer doesn't seem to implement linear attention at all; it appears to just perform regular multi-head attention. Is that the case? If not, where in the LinformerEncoder layers does the linearisation take place? For reference, I've sketched below what I expected to see.
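
To be concrete, here is a minimal sketch of the Linformer-style attention I had in mind (my own illustration, not code from your repo; names like `seq_len` and `proj_dim` are placeholders): the keys and values get compressed along the sequence axis by learned projections E and F before standard scaled dot-product attention, which is what reduces the cost from O(n^2) to O(n*k). I couldn't find an equivalent projection step in the LinformerEncoder path.

```python
import torch
import torch.nn as nn


class LinformerSelfAttention(nn.Module):
    """Sketch of Linformer attention: low-rank projection of K/V along the sequence axis."""

    def __init__(self, embed_dim: int, num_heads: int, seq_len: int, proj_dim: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)
        # Learned projections along the *sequence* dimension (seq_len -> proj_dim).
        # This is the step that distinguishes Linformer from vanilla multi-head attention.
        self.E = nn.Parameter(torch.randn(proj_dim, seq_len) / seq_len ** 0.5)
        self.F = nn.Parameter(torch.randn(proj_dim, seq_len) / seq_len ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        q = self.q_proj(x).view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x)
        v = self.v_proj(x)
        # Compress keys/values from length n down to proj_dim: the "linearisation" step.
        k = torch.einsum("kn,bnd->bkd", self.E, k)
        v = torch.einsum("kn,bnd->bkd", self.F, v)
        k = k.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        # Attention matrix is now (n x proj_dim) per head instead of (n x n).
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.out_proj(out)
```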
Thanks,
Josh