Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
Survey: https://arxiv.org/pdf/2507.20198
A High-Efficiency System of Large Language Model Based Search Agents
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
Official PyTorch implementation of the paper "Dataset Distillation via the Wasserstein Metric" (ICCV 2025).
TinyML and Efficient Deep Learning Computing | MIT 6.S965/6.5940
Dynamic Attention Mask (DAM) generates adaptive sparse attention masks per layer and head for Transformer models, enabling long-context inference with lower compute and memory overhead, without fine-tuning.
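A minimal sketch of the general idea (not DAM's actual code; the window sizes and function names below are illustrative): build a per-head sparse mask and apply it when scoring attention, so each query only attends to a subset of keys.

```python
import torch

def local_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean (seq_len x seq_len) mask: True where attention is allowed.
    A fixed causal window stands in for an adaptive, learned mask."""
    idx = torch.arange(seq_len)
    dist = idx[:, None] - idx[None, :]          # query index minus key index
    return (dist >= 0) & (dist < window)        # causal + local window

def masked_attention(q, k, v, mask):
    # q, k, v: (batch, heads, seq, dim); mask broadcasts over batch and heads
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Give each head its own window size as a crude proxy for per-head adaptivity.
B, H, L, D = 1, 4, 1024, 64
q = k = v = torch.randn(B, H, L, D)
masks = torch.stack([local_window_mask(L, w) for w in (64, 128, 256, 512)])  # (H, L, L)
out = masked_attention(q, k, v, masks)  # (1, 4, 1024, 64)
```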
Official PyTorch implementation of the paper "Towards Adversarially Robust Dataset Distillation by Curvature Regularization" (AAAI 2025).
A deep learning framework that implements Early Exit strategies in Convolutional Neural Networks (CNNs) using Deep Q-Learning (DQN). This project enhances computational efficiency by dynamically determining the optimal exit point in a neural network for image classification tasks on CIFAR-10.
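A minimal sketch of the early-exit pattern under simplified assumptions: auxiliary classifier heads are attached to intermediate CNN stages, and a plain confidence threshold stands in for the DQN exit policy described above; all module names are illustrative.

```python
import torch
import torch.nn as nn

class EarlyExitCNN(nn.Module):
    """Two-stage CNN with an auxiliary classifier after the first stage."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.exit1 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes))
        self.stage2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.exit2 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))

    def forward(self, x, threshold: float = 0.9):
        h = self.stage1(x)
        logits1 = self.exit1(h)
        # Take the early exit if the intermediate head is confident enough
        # (the project above instead lets a DQN agent make this decision).
        if torch.softmax(logits1, dim=-1).max() > threshold:
            return logits1, "exit1"
        return self.exit2(self.stage2(h)), "exit2"

model = EarlyExitCNN().eval()
with torch.no_grad():
    logits, exit_used = model(torch.randn(1, 3, 32, 32))  # CIFAR-10-sized input
```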
Ground-Truthing AI Energy Consumption: Validating CodeCarbon Against External Measurements
MOCA-Net: a novel neural architecture with sparse MoE, external memory, and budget-aware computation. Integrates the Stanford SST-2 dataset, runs at O(L) complexity, and reaches 96.40% accuracy. Built for efficient sequence modeling.
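A minimal sketch of the sparse-MoE ingredient only (illustrative code, not MOCA-Net's implementation): a top-k router that evaluates a fixed budget of experts per token.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Sparse mixture of experts: each token is routed to k of E experts."""
    def __init__(self, dim: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.k = k  # compute budget: experts evaluated per token

    def forward(self, x):                              # x: (tokens, dim)
        logits = self.router(x)                        # (tokens, E)
        weights, idx = logits.topk(self.k, dim=-1)     # keep only the top-k experts
        weights = torch.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE(dim=64)
y = moe(torch.randn(16, 64))  # only 2 of 8 experts run per token
```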
An open and practical guide to Edge Language Models
"TRM (Tiny Recursive Model) integration architecture for Symbion.space ecosystem"
In this repo you will understand the process of reducing the precision of a model's parameters and/or activations (e.g., from 32-bit floating point to 8-bit integers) to make neural networks smaller, faster, and more energy-efficient with minimal accuracy loss.
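A minimal example of the idea using PyTorch's built-in dynamic post-training quantization (the toy model here is illustrative, not taken from the repo): Linear weights are stored as 8-bit integers and dequantized on the fly at inference time.

```python
import torch
import torch.nn as nn

# Toy FP32 model; quantization shrinks the Linear weights from 32-bit floats
# to 8-bit integers while keeping the same forward interface.
model_fp32 = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(model_fp32(x).shape, model_int8(x).shape)  # same outputs shape, smaller weights
```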
A production-grade GPT transformer implemented from scratch in C++. Runs on modest hardware, and ships with complete mathematical derivations and optimized tensor operations.
QuantLab-8bit is a reproducible benchmark of 8-bit quantization on compact vision backbones. It includes FP32 baselines, PTQ (dynamic & static), QAT, ONNX exports, parity checks, ORT CPU latency, and visual diagnostics.
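As a rough sketch of the latency-measurement step such a benchmark performs (the model path below is a hypothetical placeholder; any classification ONNX export works), ONNX Runtime's CPU provider can be timed directly:

```python
import time
import numpy as np
import onnxruntime as ort

# Hypothetical exported model path; substitute your own ONNX file.
sess = ort.InferenceSession("model_int8.onnx", providers=["CPUExecutionProvider"])
name = sess.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Warm up, then time repeated runs for a rough CPU latency figure.
for _ in range(10):
    sess.run(None, {name: x})
t0 = time.perf_counter()
for _ in range(100):
    sess.run(None, {name: x})
print(f"mean latency: {(time.perf_counter() - t0) / 100 * 1e3:.2f} ms")
```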
Task-Aware Dynamic Model Optimization for Multi-Task Learning (IEEE Access 2023)
🔬 Curiosity-Driven Quantized Mixture of Experts
Code for paper "Automated Design for Hardware-aware Graph Neural Networks on Edge Devices"