AAIRM

Agentic AI Inventory Replenishment and Management
Multi-category retail inventory optimization with coordinated agentic decision-making.

AAIRM is a research-oriented framework that combines demand forecasting, replenishment optimization, supplier-aware execution, and governance controls in one end-to-end workflow. It is designed for reproducible benchmarking, multi-category experimentation, and publication-ready analysis.

✨ Highlights

🤖 Agentic inventory optimization: coordinated perception, conceptualization, and action layers.
🛒 Multi-category setting: unified simulation over grocery, frozen_food, apparel, cosmetics, and dry_fruits.
⚖️ Cost-service trade-off learning: lower normalized cost in exchange for a lower fill rate than the baselines (0.9229 vs 0.9881 for ROP-EOQ at 100 SKUs).
📈 Scalability validation: controlled scaling from 100 SKU to 500 SKU settings with fixed protocols.
🔬 Research-ready workflow: reproducible seeds, ablations, benchmark baselines, and structured outputs.

🏗️ System Overview

AAIRM organizes decision-making into specialized components:

Perception agents: ingest demand signals, supplier behavior, and environment state.
Conceptualization agents: produce policy-level decisions (forecasting, constraints, and replenishment intent).
Action agents: execute procurement and inventory actions through tools and ERP-compatible interfaces.
Governance infrastructure: audit ledger, health monitoring, and reputation signals to constrain unsafe actions.
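
The flow through these components can be sketched as below. This is a minimal illustration only: every class, function, and parameter name here (Observation, Intent, Governance, conceptualize, act, max_order_qty, cover_periods) is hypothetical and is not taken from the aairm package, and the order-up-to rule stands in for whatever forecasting and RL policy the framework actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Perception output: the current view of one SKU (hypothetical shape)."""
    sku: str
    on_hand: int
    recent_demand: list[float]


@dataclass
class Intent:
    """Conceptualization output: a policy-level replenishment intent."""
    sku: str
    order_qty: int


@dataclass
class Governance:
    """Caps order sizes and records each decision in an append-only ledger."""
    max_order_qty: int
    ledger: list[str] = field(default_factory=list)

    def review(self, intent: Intent) -> Intent:
        approved = min(intent.order_qty, self.max_order_qty)
        self.ledger.append(
            f"{intent.sku}: requested={intent.order_qty} approved={approved}"
        )
        return Intent(intent.sku, approved)


def conceptualize(obs: Observation, cover_periods: int = 2) -> Intent:
    """Naive stand-in policy: order up to `cover_periods` of mean recent demand."""
    forecast = sum(obs.recent_demand) / len(obs.recent_demand)
    target = round(forecast * cover_periods)
    return Intent(obs.sku, max(0, target - obs.on_hand))


def act(intent: Intent) -> dict:
    """Stand-in for a procurement call; returns the payload it would send."""
    return {"sku": intent.sku, "qty": intent.order_qty}


obs = Observation(sku="grocery-001", on_hand=3, recent_demand=[4.0, 6.0, 5.0])
gov = Governance(max_order_qty=5)
order = act(gov.review(conceptualize(obs)))
print(order)       # the governance cap reduces the requested 7 units to 5
print(gov.ledger)
```

Placing the governance check between the intent and the action means an unsafe quantity is never passed to the execution layer, and the ledger keeps both the requested and the approved values for later audit.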

Core package layout:

aairm/agents/ for multi-agent orchestration and role-specific logic.
aairm/models/ for forecasting and reinforcement learning modules.
aairm/simulation/ for environment, supplier, and demand simulation.
aairm/evaluation/ for benchmark metrics, reporting, and experiment summaries.

📊 Results Summary

Main Results (100 SKUs, 10 Seeds, 200 Episodes)

Primary experiment output: experiments/results/main_100sku_10seed/summary.json

Metric	AAIRM	Baseline1 (ROP-EOQ)	Baseline2 (ML+Static)
Stockout Rate	0.0771 +/- 0.0078	0.0119 +/- 0.0031	0.0486 +/- 0.0377
Fill Rate	0.9229 +/- 0.0078	0.9881 +/- 0.0031	0.9514 +/- 0.0377
Avg Inventory	5.0660 +/- 0.1618	7.1025 +/- 0.2562	7.4146 +/- 1.7718
Total Cost (normalized)	0.8679 +/- 0.0141	1.0000 +/- 0.0000	1.1321 +/- 0.1178
Spoilage Rate	0.0456 +/- 0.0041	0.0585 +/- 0.0054	0.0558 +/- 0.0144
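
Each cell in the table has the form "value +/- spread" over the seeds. The README does not state whether the spread is a sample standard deviation, a population standard deviation, or a confidence interval, so the sketch below assumes a sample standard deviation. The function name and the input values are made up for illustration and are not results from the experiments.

```python
from statistics import mean, stdev


def summarize(per_seed_values: list[float]) -> str:
    """Format per-seed results as 'mean +/- std' with four decimals.

    Assumption: the spread is the sample standard deviation (n - 1).
    """
    m = mean(per_seed_values)
    s = stdev(per_seed_values) if len(per_seed_values) > 1 else 0.0
    return f"{m:.4f} +/- {s:.4f}"


# Made-up per-seed fill rates, for illustration only.
example = summarize([0.90, 0.92, 0.94])
print(example)
```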

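Cost improvement: AAIRM's normalized total cost is ~13.2% lower than Baseline1 (ROP-EOQ) and ~23.3% lower than Baseline2 (ML+Static). This comes with a lower fill rate than either baseline (0.9229 vs 0.9881 and 0.9514).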
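
The percentages follow directly from the normalized costs in the table. A small check, where the helper name relative_reduction is an illustrative choice rather than part of the repository:

```python
def relative_reduction(ours: float, baseline: float) -> float:
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# Normalized total cost means from the 100-SKU table.
aairm, rop_eoq, ml_static = 0.8679, 1.0000, 1.1321

vs_rop_eoq = relative_reduction(aairm, rop_eoq)
vs_ml_static = relative_reduction(aairm, ml_static)
print(f"vs Baseline1 (ROP-EOQ):   {vs_rop_eoq:.1f}%")
print(f"vs Baseline2 (ML+Static): {vs_ml_static:.1f}%")
```

Each reduction is measured relative to the baseline it is compared against, which is why the Baseline2 figure is 23.3% rather than the 26.4-point difference in normalized cost.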

Scalability Results (500 SKUs, 5 Seeds, 200 Episodes)

Secondary output: experiments/results/scalability_500sku_5seed/summary.json

At 500 SKUs, AAIRM preserves a clear cost advantage (0.8292 vs 1.2033 for Baseline2, about 31% lower). Service quality declines in the highly perishable and volatile segments (notably dry_fruits); this reflects an explicit cost-service trade-off at larger scale rather than a pipeline failure.

🧠 Multi-Category Behavior

AAIRM is evaluated on five balanced retail categories:

grocery
frozen_food
apparel
cosmetics
dry_fruits

Observed behavior:

Perishability gradient: apparel shows near-zero spoilage; dry_fruits has consistently higher spoilage pressure.
Demand heterogeneity: category-specific dynamics induce different service and inventory patterns.
Adaptive policy posture: decisions vary by category to reduce aggregate holding burden while controlling total cost.

🛠️ Installation

Option A: Minimal runtime setup

python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt

Option B: Editable package install

python -m venv .venv
.\.venv\Scripts\activate
pip install -e .

Option C: Full development environment

python -m venv .venv
.\.venv\Scripts\activate
pip install -e ".[dev]"
pre-commit install

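Python 3.10+ is required. The activation command shown (.\.venv\Scripts\activate) and the backtick line continuations used in the Quickstart are Windows PowerShell syntax; on Linux or macOS, activate with source .venv/bin/activate and continue lines with a backslash (\).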
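
To confirm that the active interpreter meets this requirement before installing, a quick check such as the following can help. The names MIN_PYTHON and python_ok are illustrative and not part of the repository.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in this README


def python_ok(version: tuple = tuple(sys.version_info[:2])) -> bool:
    """Return True if `version` (major, minor) satisfies the minimum."""
    return tuple(version) >= MIN_PYTHON


print(f"Python {sys.version_info.major}.{sys.version_info.minor} supported: {python_ok()}")
```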

🚀 Quickstart

Run main experiment (100 SKUs)

python scripts/run_smoke_multiseed.py `
  --seeds 42,43,44,45,46,47,48,49,50,51 `
  --episodes 200 `
  --n-skus 100 `
  --out-dir experiments/results/main_100sku_10seed

Run scalability experiment (500 SKUs)

python scripts/run_smoke_multiseed.py `
  --seeds 42,43,44,45,46 `
  --episodes 200 `
  --n-skus 500 `
  --out-dir experiments/results/scalability_500sku_5seed
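
For a shell-independent launch, the same invocation can be assembled in Python. This is a sketch that uses only the script path and flags shown in the commands above; the helper names build_command and run are hypothetical and not part of the repository.

```python
import subprocess
import sys


def build_command(seeds: list[int], episodes: int, n_skus: int, out_dir: str) -> list[str]:
    """Assemble the argument list for scripts/run_smoke_multiseed.py."""
    return [
        sys.executable,
        "scripts/run_smoke_multiseed.py",
        "--seeds", ",".join(str(s) for s in seeds),
        "--episodes", str(episodes),
        "--n-skus", str(n_skus),
        "--out-dir", out_dir,
    ]


def run(cmd: list[str]) -> None:
    """Launch the experiment; call this from the repository root."""
    subprocess.run(cmd, check=True)


# Scalability run: seeds 42..46, 200 episodes, 500 SKUs.
cmd = build_command(
    list(range(42, 47)), 200, 500,
    "experiments/results/scalability_500sku_5seed",
)
print(" ".join(cmd[1:]))
```

Passing the arguments as a list avoids shell-specific quoting and line-continuation rules, so the same code works in PowerShell, cmd, and POSIX shells.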

🧪 Reproducibility & Experiments

Fixed seeds are used for benchmark consistency (42-51 for the main run and 42-46 for the scalability run, as in the Quickstart commands).
Baselines include ROP-EOQ and ML+Static policies.
Reproduction and ablation scripts are provided under experiments/ and scripts/.
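
For readers unfamiliar with the ROP-EOQ baseline, the textbook forms of the two quantities are sketched below. This README does not say how Baseline1 is parameterized, so the function names, arguments, and example inputs are illustrative assumptions rather than the repository's implementation.

```python
import math


def eoq(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """Economic order quantity: sqrt(2 * D * K / h).

    D = demand per year, K = fixed cost per order,
    h = holding cost per unit per year.
    """
    return math.sqrt(2.0 * annual_demand * order_cost / holding_cost)


def reorder_point(
    daily_demand: float,
    lead_time_days: float,
    demand_std: float = 0.0,
    z: float = 0.0,
) -> float:
    """Reorder point: expected lead-time demand plus safety stock.

    Safety stock is z * sigma_d * sqrt(L), assuming independent daily demand.
    """
    safety_stock = z * demand_std * math.sqrt(lead_time_days)
    return daily_demand * lead_time_days + safety_stock


# Illustrative inputs only (not taken from the experiments).
q = eoq(annual_demand=1200, order_cost=50, holding_cost=3)
rop = reorder_point(daily_demand=4, lead_time_days=5, demand_std=2, z=1.65)
print(f"order {q:.0f} units when inventory falls to {rop:.1f}")
```

Because both quantities are fixed once the parameters are set, a policy of this kind does not adapt to category-specific perishability or demand shifts.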

Useful entry points:

experiments/run_paper_experiment.py
experiments/run_ablation.py
experiments/run_realworld.py
scripts/run_smoke_multiseed.py

🧰 Development Commands

If you use Make, common targets include:

make install-dev
make lint
make format
make typecheck
make test-fast
make docs

On Windows without Make, run equivalent commands directly (ruff, black, mypy, pytest, mkdocs).

📁 Repository Structure

aairm/                  # Core framework (agents, models, simulation, evaluation, tools)
configs/                # Experiment and dataset configuration files
scripts/                # Automation scripts (data prep, smoke runs, exports)
experiments/            # Paper reproduction and ablation runners
docs/                   # MkDocs documentation source
tests/                  # Unit, integration, and smoke tests
README.md

📖 Documentation

Project docs: https://aliakarma.github.io/AAIRM
Local docs server:

mkdocs serve

🤝 Contributing

Contributions are welcome.

Fork the repository.
Create a feature branch.
Run linting/tests locally.
Open a pull request with a clear change summary.

Please review CONTRIBUTING.md and CODE_OF_CONDUCT.md before submitting changes.

📌 Citation

If you use AAIRM in academic or industrial research, please cite using the metadata in CITATION.cff.

📜 License

This project is licensed under the MIT License. See LICENSE for details.
