Non-record: hybrid spiking Transformer (SNN) with a multi-step spiking MLP #664

Open
tsbiosky wants to merge 1 commit into openai:main from tsbiosky:main
Conversation

@tsbiosky tsbiosky commented Mar 25, 2026

Hybrid Spiking Neural Network (SNN) MLP

val_bpb: 1.2982 | 15.78 MB | 8×H100 SXM

A contest-friendly hybrid SNN submission built from the train_gpt.py baseline: keep dense GQA attention and the original training/eval/compression pipeline, but replace the standard feed-forward block with a small multi-step leaky integrate-and-fire (LIF-style) spiking MLP.

Reference: https://arxiv.org/pdf/2203.14679
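The FFN replacement described above can be sketched as follows. This is a minimal forward-pass illustration in pure Python, not the PR's actual implementation (which presumably builds on PyTorch inside train_gpt.py); the function name and hyperparameters (`T` timesteps, leak factor `tau`, threshold `v_th`) are illustrative assumptions.

```python
def lif_mlp_forward(x, w_in, w_out, T=4, tau=0.5, v_th=1.0):
    """Multi-step leaky integrate-and-fire (LIF-style) MLP sketch.

    x      : input vector
    w_in   : input->hidden weight rows
    w_out  : hidden->output weight rows
    The hidden layer runs as LIF neurons for T timesteps; the spike
    trains are averaged into firing rates before the dense readout.
    """
    n_hidden = len(w_in)
    v = [0.0] * n_hidden          # membrane potentials
    spike_sum = [0.0] * n_hidden  # accumulated spikes per neuron
    # Pre-synaptic current: the same dense projection at every timestep.
    current = [sum(wi * xi for wi, xi in zip(row, x)) for row in w_in]
    for _ in range(T):
        for j in range(n_hidden):
            v[j] = tau * v[j] + current[j]  # leaky integration
            if v[j] >= v_th:                # fire, then hard reset
                spike_sum[j] += 1.0
                v[j] = 0.0
    rate = [s / T for s in spike_sum]       # firing rates in [0, 1]
    # Dense readout of the averaged spike activity.
    return [sum(wo * r for wo, r in zip(row, rate)) for row in w_out]
```

In a trained model the hard threshold would need a surrogate gradient to backpropagate through, which this forward-only sketch omits.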

Why this is interesting

This is not a fully spiking language model. It is a hybrid Transformer + SNN-MLP design:

  • embeddings, attention, residual path, and logits remain standard dense LM components
  • only the feed-forward block is replaced by a spiking mechanism
  • the original Parameter Golf training and export path stays intact

That makes the experiment meaningful for the contest setting because it isolates one question:

Can a spiking neural network achieve good performance in a tiny language model under a strict size budget?
