A curated, opinionated list of resources for Product Managers navigating the AI era.
Not everything — just the signal. Filtered by a practicing PM who builds AI products and uses these tools daily.
Maintained by Sumit Soni · LinkedIn
Last updated: 2026-03-17
- AI PM Foundations
- Vibe Coding & Agentic Tools
  - CLI Agents · UI Prototyping · Code Editors · Agentic Workflow Tools · PM Skill Libraries
- Prompt Libraries for PMs
- Evals & AI Quality
  - Eval Frameworks · Observability · Quality Scorecards
- Building AI Products
- LLM Economics for PMs
- Learning Resources
- Newsletters & Communities
- Open Source PM Tools
Understanding what it means to be a PM in the AI era — not just using AI tools, but owning AI features end-to-end.
- What Is an AI Product Manager? — Paweł Huryn's take on the role. Solid starting point.
- AI PM Learning Roadmap — Structured path from PM → AI PM.
- How to Build a RAG Chatbot — Hands-on guide for PMs who want to build, not just spec.
- How to Create an AI Product Strategy — Strategy frameworks for AI products.
The tools PMs are using to ship without a dedicated engineering team.
- Claude Code — Anthropic's CLI agent. Best for structured SDLC, serious builds, and repo-level context. Skills + hooks + MCP support.
- OpenAI Codex — OpenAI's coding agent. Good for agentic experiments and broad exploration.
- Gemini CLI — Google's CLI. Best for large-context analysis, research, and multimodal tasks.
- v0.dev — Vercel's AI UI generator. Describe → get production-ready React components.
- Bolt.new — Full-stack prototyping in the browser. Clone → deploy in minutes.
- Cursor — AI-native code editor. Best for longer coding sessions with full codebase context.
- GitHub Copilot — In-IDE pair programmer. Good entry point if you're already in VS Code.
- BloopAI/vibe-kanban ⭐ 23k — PM-friendly workflow layer for coding agents. Helps non-engineers structure agent tasks and track progress.
- OpenHands/OpenHands ⭐ 69k — Autonomous coding agent with broad community support and active maintenance.
- Aider-AI/aider ⭐ 42k — Terminal-based coding agent built for iterative product build loops.
- stackblitz-labs/bolt.diy ⭐ 19k — Open-source browser-based full-stack vibe coding. Strong for rapid founder prototyping.
- PrefectHQ/fastmcp ⭐ 23.7k — FastMCP framework for exposing tools and workflows to agents. Useful reference now that MCP is showing up in more founder and PM build conversations.
- phuryn/pm-skills ⭐ 7.3k — 100+ agentic skills for Claude Code, Codex, Gemini CLI, Cursor, and Kiro. Covers discovery → strategy → GTM.
Structured prompts for real PM workflows — not generic "write me a PRD" prompts.
- isumitsoni/pm-prompts — Practical prompt library for every PM workflow: discovery, strategy, execution, AI feature specs, metrics analysis, and career. 8 categories, 53 prompts.
- Claude Code Skills Marketplace — Slash commands and skills for Claude Code that automate PM workflows.
Know a good prompt library? Open a PR.
How to define "good" for an AI feature — the PM's job, not the ML engineer's.
- Anthropic's Guide to Evals — How to build test suites for LLM applications. The starting point.
- AI Evals Course on Maven — Practical eval design for PMs. Highly recommended.
- promptfoo/promptfoo ⭐ 16k — Evals + red-teaming framework PMs can use as a release gate. Runs prompt comparisons and catches regressions.
- confident-ai/deepeval ⭐ 14k — Straightforward framework for building LLM evaluation suites. Good starting point for PM-led eval programs.
- Arize-ai/phoenix ⭐ 8.8k — Observability + eval workflows for production AI features. Useful for ongoing quality monitoring.
- truera/trulens ⭐ 3.1k — Feedback and eval instrumentation for LLM systems.
- langfuse/langfuse ⭐ 23.3k — Open-source observability, evals, prompt management, and datasets. Good PM pick because it makes quality review and regression tracking visible beyond raw logs.
- Helicone/helicone ⭐ 5.3k — LLM observability plus experimentation. Useful for PMs who want to compare prompts, models, and live behavior without building custom dashboards first.
- tensorzero/tensorzero ⭐ 11.1k — Gateway plus observability, optimization, and evaluation in one stack. Strong addition for PMs thinking about production quality and experimentation together.
- vibrantlabsai/ragas ⭐ 13.0k — Evaluation toolkit for LLM and RAG systems. Helpful when PMs need a more structured quality conversation than "the outputs seem better."
- guardrails-ai/guardrails ⭐ 6.5k — Guardrail layer for LLM applications. Good PM resource for trust-sensitive workflows where failure handling matters as much as model performance.
- comet-ml/opik ⭐ 18.3k — Open-source AI observability, evaluation, and optimization. Strong PM pick when you want traces, evals, and prompt iteration in one system instead of separate tools.
- Giskard-AI/giskard-oss ⭐ 5.1k — AI testing framework focused on performance, bias, and security risk. Useful when PMs need a more explicit trust and red-team conversation before rollout.
- Agenta-AI/agenta ⭐ 3.9k — Open-source LLMOps stack for prompt management, evaluation, and observability. Good fit for teams graduating from ad hoc prompt tests into repeatable experimentation.
- Key PM mental model: Define your eval criteria before you build. "The AI should sound like a human" is not an eval. "95% of outputs score ≥4/5 on our rubric" is.
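To make the "95% of outputs score ≥4/5" criterion concrete, here is a minimal sketch of a release gate in Python. It assumes you already have rubric scores (1 to 5) for a sample of outputs, from human raters or an LLM judge. The function name `release_gate`, the `GateResult` type, and the sample scores are all hypothetical and are not taken from any tool listed above.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    pass_rate: float  # share of outputs at or above the rubric threshold
    passed: bool      # True when pass_rate meets the required rate


def release_gate(scores, min_score=4, required_pass_rate=0.95):
    """Check whether enough scored outputs meet the rubric bar.

    scores: rubric scores (1-5), one per sampled output.
    min_score: lowest score that counts as acceptable.
    required_pass_rate: share of outputs that must be acceptable.
    """
    if not scores:
        raise ValueError("need at least one scored output")
    meeting_bar = sum(1 for s in scores if s >= min_score)
    pass_rate = meeting_bar / len(scores)
    return GateResult(pass_rate=pass_rate, passed=pass_rate >= required_pass_rate)


# Hypothetical rubric scores for 20 sampled outputs (illustrative data only).
sample_scores = [5, 4, 4, 5, 3, 4, 5, 5, 4, 4, 5, 4, 4, 5, 4, 5, 4, 4, 5, 4]

result = release_gate(sample_scores)
print(f"pass rate: {result.pass_rate:.0%}, ship: {result.passed}")
```

The point of the sketch is that the threshold and the required pass rate are written down before the build starts, so "is it good enough?" becomes a check anyone on the team can rerun.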
Patterns, frameworks, and hard-won lessons from PMs who ship AI features.
- Vibe Coding vs. Vibe Engineering — Important distinction: vibe coding is prototyping; vibe engineering is production. Know which one you're doing.
- The PM's Guide to RAG — When to use RAG vs fine-tuning vs prompt engineering, explained for PMs.
- Shubhamsaboo/awesome-llm-apps — Working code examples for LLM applications. Clone → run. Great for PMs who want to understand by doing.
- humanlayer/12-factor-agents ⭐ 18.8k — Production principles for LLM-powered software. Strong fit for PMs and founders because it sharpens architecture judgment without feeling like an ML textbook.
| Decision | What to consider |
|---|---|
| RAG vs fine-tuning | RAG for dynamic/private data; fine-tuning for consistent tone/style |
| Model choice | Cost/latency/capability triangle — don't default to the biggest model |
| Streaming vs batch | User experience decision first, then engineering |
| Memory architecture | Session vs persistent vs none — depends on your trust model |
Cost and latency are product decisions. PMs need to own them.
- BerriAI/litellm ⭐ 39.3k — Model gateway with cost tracking, routing, logging, and guardrails. Very relevant for PMs making cost-versus-quality decisions across providers.
- Portkey-AI/gateway ⭐ 10.9k — AI gateway with routing and integrated guardrails. Useful for PMs who need reliability, fallback logic, and vendor flexibility built into product decisions.
- Token pricing reference: Anthropic pricing · OpenAI pricing · Google pricing
- Mental model: Input tokens are cheap. Output tokens typically cost 3-5x more per token, so response length often drives the bill. Cap output length and ask for concise formats where quality allows.
- Model tiering: Haiku/Flash for high-volume, low-stakes tasks. Sonnet/Pro for core reasoning. Opus/Ultra for one-shot complex analysis.
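- Caching: If your system prompt is long and reused, prompt caching can cut the cost of the cached input tokens by up to roughly 90%, depending on the provider. The saving on the total bill is smaller, because uncached input and output tokens are still billed at full price.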
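A rough worked example of the arithmetic behind these points, as a Python sketch. All prices, token counts, and request volumes below are illustrative assumptions, not any provider's current rates; check the pricing pages linked above before using real numbers. The helper `monthly_cost` is hypothetical, and it ignores cache-write surcharges and cache expiry, which real providers may apply.

```python
def monthly_cost(requests, system_tokens, user_tokens, output_tokens,
                 input_price_per_mtok, output_price_per_mtok,
                 cached_read_multiplier=None):
    """Estimate monthly spend in dollars.

    Prices are dollars per million tokens.
    cached_read_multiplier: fraction of the input price charged for the
    reused system prompt when it is served from cache (e.g. 0.1).
    None means no caching. Simplification: cache writes are not modeled.
    """
    multiplier = 1.0 if cached_read_multiplier is None else cached_read_multiplier
    per_request = (
        system_tokens * input_price_per_mtok * multiplier  # reused system prompt
        + user_tokens * input_price_per_mtok               # fresh input, never cached
        + output_tokens * output_price_per_mtok            # output at the higher rate
    ) / 1_000_000
    return requests * per_request


# Assumed scenario: 100k requests/month, 4,000-token system prompt,
# 500-token user message, 300-token response, $3 in / $15 out per million tokens.
scenario = dict(requests=100_000, system_tokens=4_000, user_tokens=500,
                output_tokens=300, input_price_per_mtok=3.0,
                output_price_per_mtok=15.0)

no_cache = monthly_cost(**scenario)
with_cache = monthly_cost(**scenario, cached_read_multiplier=0.1)
saving = 1 - with_cache / no_cache

print(f"no cache: ${no_cache:,.0f}  with cache: ${with_cache:,.0f}  saving: {saving:.0%}")
```

Under these assumptions the cached system prompt itself gets 90% cheaper, but the total bill drops by about 60%, because user input and output tokens are unaffected. The longer and more reused the system prompt is relative to everything else, the closer the overall saving gets to the headline figure.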
Courses and programs worth your time.
| Resource | What it covers | Who it's for |
|---|---|---|
| AI PM Certification (Maven) | Full AI PM curriculum | PMs transitioning to AI PM |
| AI Evals (Maven) | How to evaluate LLM outputs | PMs owning AI quality |
| fast.ai | Practical deep learning | PMs who want technical depth |
| Anthropic Claude Code Docs | Building with Claude Code | PMs who vibe code |
| microsoft/generative-ai-for-beginners ⭐ 108k | Structured lesson path for AI builders | PMs new to AI development |
| openai/openai-cookbook ⭐ 72k | Implementation patterns and examples | PMs prototyping AI features |
| anthropics/claude-cookbooks ⭐ 35k | Practical Claude workflows and examples | PMs building with Claude API |
High signal-to-noise sources on AI PM.
- The Product Compass — Paweł Huryn. Best AI PM newsletter. Actionable, not hype.
- Unwind AI — Shubham Saboo. Open-source AI builder community. High technical signal.
- Lenny's Newsletter — Not AI-specific, but the best general PM source.
Tools PMs can use, fork, or learn from.
- isumitsoni/purepaste — AI text cleaner. Strips robotic LLM residue from outputs. Live at purepaste.pro.
- Shubhamsaboo/awesome-llm-apps — Working LLM app templates. Great reference for scoping AI features.
This list is maintained by a practicing PM — not a curator scraping the internet.
- Open a PR if you have a resource that genuinely belongs here
- Quality bar: would you recommend this to a senior PM who has 30 minutes? If yes, submit it.
- No self-promotion without real value. No courses you haven't taken.
Star this repo if you find it useful. It helps other PMs find it.