I'm Arifuzzaman Joy, an AI Research Scholar and Machine Learning Engineer with 6+ years of experience building production-ready AI systems. I specialize in transforming theoretical AI concepts into scalable, enterprise-grade solutions, from LoRA fine-tuning for generative models to multi-agent reasoning systems deployed on cloud infrastructure.
My work LatentMAS-SLoRA is officially featured in Gen-Verse/LatentMAS, a top multi-agent reasoning framework (🤗 Hugging Face #1 Paper of the Day).
| Role | Organization | Period |
|---|---|---|
| AI & Machine Learning Engineer (Freelance) | Upwork · Fiverr · Direct Clients | 2023 – Present |
| Research Assistant | Rajshahi University, Solar Lab / AI Lab | Mar 2022 – May 2023 |
| ERP System Setup & Data Analyst | KBEC, Dhaka | 3-month contract |
- Freelance ML Engineer: Develop and deploy ML/AI models for multi-modal tasks, including image generation, video synthesis, NLP, and voice AI.
- Research Assistant: Conducted research on renewable energy (solar cells) and speech processing; applied ML/DL techniques to analyze simulation data and improve photovoltaic performance.
- ERP & Data Analyst: Implemented an Odoo ERP system for business process automation; scraped and organized contact data for niche marketing campaigns.
skills = {
    "Languages": ["Python (7+ yrs)", "SQL (5+ yrs)", "JavaScript", "HTML/CSS", "Bash"],
    "AI / ML": ["Data Science (6+ yrs)", "Machine Learning (5+ yrs)",
                "Deep Learning (5+ yrs)", "Agentic AI (2+ yrs)",
                "NLP", "Computer Vision", "Generative AI"],
    "Frameworks": ["PyTorch", "TensorFlow", "Hugging Face Transformers",
                   "Diffusers", "LangChain", "OpenCV", "Selenium",
                   "LiveKit", "Librosa", "Gradio", "PEFT"],
    "DevOps & MLOps": ["Docker (3+ yrs)", "Kubernetes (3+ yrs)",
                       "CI/CD (3+ yrs)", "Git", "GitHub Actions"],
    "Cloud & Infra": ["Azure (3+ yrs)", "AWS (3+ yrs)", "RunPod",
                      "MongoDB", "Firebase", "MySQL"],
    "Spoken Languages": ["English (Fluent)", "Bangla (Native)"],
}
🧩 LatentMAS-SLoRA

Officially featured in Gen-Verse/LatentMAS, a leading multi-agent reasoning framework (🤗 Hugging Face #1 Paper of the Day, arXiv:2511.20639). A multi-agent reasoning system that augments LatentMAS with role-specialized, dynamically switchable LoRA adapters for better specialization and adaptability. Features VLM support (Qwen2.5-VL-7B), latent-space collaboration, RAG integration, and RunPod serverless deployment.

Key Results: +12% accuracy improvement, 2.7× faster inference, 63.6% token reduction vs. traditional RAG.

Architecture: Planner → Critic (latent) → Refiner (latent) → Judger (text)
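
The Planner, Critic, Refiner, Judger flow with one switchable adapter per role can be pictured with a small toy sketch. This is not the project's code: the class `AdapterSwitchingPipeline`, its methods, and the string-based stub adapters are all hypothetical names invented for illustration. In the real system the intermediate hand-offs are described as latent rather than text, and the adapters would be LoRA weights on a shared base model rather than plain Python functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Role order taken from the architecture line; everything else is hypothetical.
ROLES = ["planner", "critic", "refiner", "judger"]


@dataclass
class AdapterSwitchingPipeline:
    """Toy stand-in for one base model with a swappable adapter per role."""

    adapters: Dict[str, Callable[[str], str]]
    active: str = ""
    switch_log: List[str] = field(default_factory=list)

    def set_adapter(self, role: str) -> None:
        # Activate the adapter registered for this role and record the switch.
        if role not in self.adapters:
            raise KeyError(f"no adapter registered for role {role!r}")
        self.active = role
        self.switch_log.append(role)

    def step(self, state: str) -> str:
        # Apply the currently active adapter to the shared state.
        return self.adapters[self.active](state)

    def run(self, question: str) -> str:
        # Pass the state through every role in order, switching adapters each time.
        state = question
        for role in ROLES:
            self.set_adapter(role)
            state = self.step(state)
        return state


def make_stub(role: str) -> Callable[[str], str]:
    # Stub adapter: appends the role name so the call order is visible.
    return lambda state: f"{state} -> {role}"


pipeline = AdapterSwitchingPipeline({role: make_stub(role) for role in ROLES})
result = pipeline.run("question")
print(result)  # question -> planner -> critic -> refiner -> judger
```

Keeping a single pipeline object and switching the active adapter, instead of holding four separate models, mirrors the "dynamically switchable" idea in the description: only the small role-specific part changes between steps.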
Published in high-impact SCI/Scopus-indexed journals · Google Scholar · ORCID · ResearchGate
| # | Paper | Journal | IF | Quartile | Year |
|---|---|---|---|---|---|
| 1 | Machine learning assisted revelation of the best performing single hetero-junction thermophotovoltaic cell | Sustainable Energy Technologies and Assessments | 7.1 | Q1 | 2025 |
| 2 | Machine Learning-Enabled performance exploration of AuCuSeβ in thermophotovoltaic cell | Solar Energy | 6.0 | Q1 | 2024 |
| 3 | Numerical studies on a ternary AgInTe₂ chalcopyrite thin film solar cell | Heliyon | 4.0 | Q1 | 2023 |
| 4 | Numerical prediction on the photovoltaic performance of CZTS-based thin film solar cell | Nano Select | N/A | N/A | 2023 |
| 5 | Unleashing the Power of Open-Source Transformers in Medical Imaging | Int. J. Advanced Computer Science & Applications | 0.7 | N/A | 2024 |
| 6 | Spectrum estimation for voiced speech using average weighted linear prediction | N/A | N/A | N/A | 2024 |
| 7 | Enhancement of Bone Conducted Speech Using Deep Transfer Learning | N/A | N/A | N/A | 2024 |

- Self-hosted platform for hyper-realistic image generation and editing using Flux LoRA, Gradio, and Hugging Face Diffusers. Supports multi-image input, 4-bit quantization, and batch processing.
- Multi-GPU pipeline for high-fidelity image-to-video, text-to-video, and speech-to-video generation. Uses Wan 2.2 with MoE architecture and a Gradio UI for self-hosting.
- Web app for speech recognition, translation, and voice cloning across 100+ languages. Supports YouTube processing and real-time translation.
- Full-stack platform for natural multi-modal conversations with real-time SIP/WebRTC telephony and emotionally expressive AI voice interactions.
- Serverless worker for FLUX.2 Klein 4B text-to-image and image-to-image generation on RunPod.
- Open-source video generative models optimized for lower-VRAM GPUs, with a web-based interface.
- Fork of Fooocus for offline image generation with fast presets and UI enhancements.
- Low-cost, cloud-CPU-friendly starter kit for self-hosting AI tools with external sharing.
- Research code for brain tumor classification and segmentation using ConvNeXt V2 and SegFormer. Achieves up to 99.6% diagnostic accuracy.
- AI chatbot project leveraging large language models for conversational intelligence.

| Degree | Institution | Year | Result |
|---|---|---|---|
| B.Sc. in Electrical & Electronic Engineering | University of Rajshahi, Bangladesh | 2017 – 2020 | CGPA 3.13 |
| Higher Secondary Certificate (H.Sc.), Science | Dhaka Education Board | 2015 – 2016 | GPA 5.00 |
- Deep Learning with TensorFlow (IBM)
- Prompt Engineering for ChatGPT (Vanderbilt University)
- SQL (Advanced) Certificate (HackerRank)
- Introduction to Programming with MATLAB (Vanderbilt University)
- Data, Signal, and Image Analysis with MATLAB (Coursera)
"The best way to predict the future is to create it." β Alan Kay

Last updated: 2026-02-17 · Built with ❤️ by Arifuzzaman Joy
